Abstract

We study logit dynamics [3] for strategic games. This dynamics works as follows: at every stage
of the game a player is selected uniformly at random and she plays according to a noisy best-response
where the noise level is tuned by a parameter $\beta$. Such a dynamics defines a family of ergodic Markov
chains, indexed by $\beta$, over the set of strategy profiles. We believe that the stationary distribution
of these Markov chains gives a meaningful description of the long-term behavior for systems whose 
agents are not completely rational. 

Our aim is twofold: On the one hand, we are interested in evaluating the performance of the game 
at equilibrium, i.e. the expected social welfare when the strategy profiles are random according 
to the stationary distribution. On the other hand, we want to estimate how long it takes, for a 
system starting at an arbitrary profile and running the logit dynamics, to get close to its stationary 
distribution; i.e., the mixing time of the chain. 

In this paper we study the stationary expected social welfare for the 3-player CK game [6], for 
2-player coordination games, and for two simple n-player games. For all these games, we also give 
almost tight upper and lower bounds on the mixing time of logit dynamics. Our results show two 
different behaviors: in some games the mixing time depends exponentially on $\beta$, while for other
games it can be upper bounded by a function independent of $\beta$.

1 Introduction 

The evolution of a system is determined by its dynamics and complex systems are often described by 
looking at the equilibrium states induced by their dynamics. Once the system reaches an equilibrium 
state it stays there, thus equilibrium states describe the long-term behavior of the system. In this paper 
we are mainly interested in systems whose individual components are selfish agents. The state of a selfish 
system is fully described by a vector of strategies, each controlled by one agent, and each state assigns a 
payoff to each agent. The agents are selfish in the sense that they pick their strategy so as to maximize their
payoff, given the strategies of the other agents. Nash equilibrium is the classical notion of equilibrium 
for selfish systems and it corresponds to the equilibrium induced by the best-response dynamics. The 
observation that selfish systems are described by their equilibrium states (that is, by the Nash equilibria) 
has motivated the notions of Price of Anarchy [15] and Price of Stability [1] and the analysis of efficiency
of selfish systems based on such notions. 

However, such analysis inherits some of the shortcomings of the concept of a Nash equilibrium. First 
of all, the best-response dynamics assumes that selfish agents have complete knowledge of the current 
state of the system; that is, they know the payoff associated with each of their possible choices and each 
of the strategies chosen by other agents. Instead, in most cases, agents have only approximate knowledge 
of the system state or they are not able to compute their best choice. Moreover, in the presence of multiple
equilibria, it is not clear which one of them will be reached by the system, as it may depend on the initial
state: Price of Anarchy considers the worst-case equilibrium, whereas Price of Stability focuses on the
best-case equilibrium. Finally, Nash equilibria are hard to compute [7, 5] and thus for some systems it
might take very long to reach a Nash equilibrium: in this case using equilibrium states to describe the 
system performance is not well justified. Rather, one would like to analyze the performance of a system 
by using a dynamics (and its related equilibrium notion) that has the following three properties: 

• The dynamics takes into account the fact that the system components might have a perturbed or 
noisy knowledge of the system; 

• For every system the equilibrium state exists and is unique; 

• The system reaches the equilibrium very quickly regardless of the starting state. 

In this paper, we consider noisy best-response dynamics in which the behavior of the agents is described
by a parameter $\beta \geq 0$. The case $\beta = 0$ corresponds to agents picking their strategies completely
at random (that is, the agents have no knowledge of the system) and the case $\beta = \infty$ corresponds to
agents picking their strategies according to the best-response dynamics (in which the agents have full and
complete knowledge of the system). The intermediate values of $\beta$ correspond to agents that are roughly
guided by the best-response dynamics but can make a sub-optimal choice due, for example, to bounded
rationality of the agent or limited knowledge about the system: this sub-optimal behavior occurs with
some probability that depends on $\beta$ (and on the associated payoff).

We will study a specific noisy best-response dynamics for which the system evolves according to an
ergodic Markov chain for all $\beta \geq 0$. For these systems, it is natural to look at the stationary distribution
(which is the equilibrium state of the Markov chain) and to analyze the expected social welfare (the sum 
of utility functions) of the system at that distribution. We stress that the noisy best-response dynamics 
well models agents that only have approximate or noisy knowledge of the system and that for ergodic 
Markov chains (such as the ones arising in our study) the stationary distribution is known to exist and 
to be unique. Moreover, to justify the use of the stationary distribution for analyzing the performance 
of the system, we will study how fast the Markov chain converges to the stationary distribution. 

Related Works and Our Results. Several dynamics, besides the best-response dynamics, and several 
notions of equilibrium, besides Nash equilibria, have been considered to describe the evolution of a selfish 
system and to analyze its performance. See, for example, [TT1 |2~T1 I2"U] . 

Equilibrium concepts based on the best-response. When the game does not possess a pure Nash equilibrium,
the best-response dynamics will eventually cycle over a set of states (in a Nash equilibrium
the set is a singleton). These states are called sink equilibria [12]. Sink equilibria exist for all games
and, in some contexts, they seem a better approximation of the real setting than mixed Nash equilibria.
Unfortunately, sink equilibria share two undesirable properties with Nash equilibria: a game can have
more than one sink equilibrium and sink equilibria seem hard to compute [9].

Other notions of equilibrium state associated with best-response dynamics are the unit-recall equilibria 
and component-wise unit-recall equilibria (see j^)- However, we point out that the former does not always 
exist and that the latter imposes too strict limitations on the players. 

No-Regret Dynamics. Another broadly explored set of dynamics are the no-regret dynamics (see, for
example, [H]). The regret of a user is the difference between the long-term average cost and the average
cost of the best strategy in hindsight. In the no-regret dynamics the regret of every player after t steps
is o(t) (sublinear in time). In [Till HI] it is shown that the no-regret dynamics converges to the set of
correlated equilibria. Note that the convergence is to the set of correlated equilibria and not to a specific 
correlated equilibrium. 

Our work. In this paper we consider a specific noisy best-response dynamics called the logit dynamics
(see [3]) and we study its mixing time (that is, the time it takes to converge to the stationary distribution)
and the stationary expected social welfare. Specifically, 

• We start by analyzing the logit dynamics for a simple 3-player linear congestion game (the CK
game [6]), which exhibits the worst Price of Anarchy among linear congestion games. We show that
the mixing time of the logit dynamics is upper bounded by a constant independent of $\beta$. Moreover,
we show that the stationary expected social welfare is larger than the social welfare of the worst
Nash equilibrium for all $\beta$;

• We then analyze the 2 × 2 coordination games studied in [3]. Here we show that, under some
conditions, the stationary expected social welfare is larger than the social welfare of the worst
Nash equilibrium. We give upper and lower bounds on the mixing time that are exponential in $\beta$. We also
observe that the same bounds apply to anti-coordination games;

• Finally, we apply our analysis to two simple n-player games: the OR game and the XOR game. We
give upper and lower bounds on the mixing time: we show that the mixing time of the OR game
can be upper bounded by a function independent of $\beta$, while the mixing time of the XOR game
increases exponentially in $\beta$. We also prove that for $\beta = O(\log n)$ the mixing time is polynomial
in n for both games.

The logit dynamics was first studied by Blume [3], who showed that, for 2 × 2 coordination games,
the long-term behavior of the Markov chain is concentrated in the risk dominant equilibrium (see [13])
for sufficiently large $\beta$. Ellison [8] studied different noisy best-response dynamics for coordination games
assuming that interaction among players is described by a graph; that is, the utility of a player is
determined only by the strategies of the adjacent players. Specifically, Ellison [8] studied interaction
modeled by rings and showed that some large fraction of the players will eventually choose the risk
dominant strategy. Similar results were obtained by Peyton Young [22] for the logit dynamics and for
more general families of graphs. Montanari and Saberi [17] gave bounds on the hitting time (the expected
time that the logit dynamics takes to reach a specific state) of the risk dominant equilibrium state in
terms of some graph theoretic properties of the underlying interaction network. Asadpour and Saberi [2]
studied the hitting time for a broader class of congestion games. We notice that none of [3, 8, 22] gave
any bound on the convergence time to the risk dominant equilibrium. Montanari and Saberi [17] were
the first to do so but their study focuses on the hitting time of a specific configuration and not on the
convergence time to the stationary distribution.

From a technical point of view, our work follows the lead of [3, 8, 22] and extends their technical
findings by giving bounds on the mixing time of the Markov chain of the logit dynamics. We stress
that previous results only proved that, for sufficiently large $\beta$, eventually the system concentrates around
certain states, without further quantifying the rate of convergence or the asymptotic behaviour of the
system for small values of $\beta$. Instead, we identify the stationary distribution of the logit dynamics as the
global equilibrium and we evaluate the social welfare at stationarity and the time it takes the system to
reach it (the mixing time) as explicit functions of $\beta$.

We choose to start our study from the class of coordination games considered in [3] and two simple
n-player games (the OR game and the XOR game). We give nearly tight upper and lower bounds on
the mixing time. Despite their game-theoretic simplicity, the analytical study of the mixing time of the
logit dynamics for the two n-player games is far from trivial. We notice that the results in [17] cannot
be used to derive upper bounds on the mixing time.

From a more conceptual point of view, our work tries (similarly to [121 El H2]) to introduce a solution 
concept that well models the behavior of selfish agents, is uniquely defined for any game, and is quickly 
reached from any starting state. We propose the stationary distribution induced by the logit dynamics 
as a possible solution concept and exemplify its use in the analysis of the performance of some 2 × 2
games (such as the ones considered in [3]), of games used to obtain tight bounds on the Price of Anarchy, and
of two simple multi-player games. 

Organization of the paper. In Section 2 we summarize some Markov chain notions that we will use
throughout the paper. In Section 3 we formally describe the logit dynamics for strategic games. We
also describe the coupling we will repeatedly use in the proofs of the upper bounds on mixing times. In
Sections 4, 5, 6 and 7 we study the stationary expected social welfare and the mixing time of the logit
dynamics for the CK game, coordination games, the OR game, and the XOR game, respectively. Finally, in
Section 8 we present conclusions and some open problems.

Notation. We write $\bar{S}$ for the complementary set of a set $S$; we write $|S|$ for its size. We use bold symbols
for vectors; when $\mathbf{x} = (x_1, \ldots, x_n) \in \{0,1\}^n$ we write $|\mathbf{x}|$ for the number of 1s in $\mathbf{x}$; i.e., $|\mathbf{x}| = |\{i \in [n] : x_i = 1\}|$.
For two vectors $\mathbf{x}, \mathbf{y}$ let $H(\mathbf{x}, \mathbf{y}) = |\{i \in [n] : x_i \neq y_i\}|$ be their Hamming distance; we write
$\mathbf{x} \sim \mathbf{y}$ if $H(\mathbf{x}, \mathbf{y}) = 1$. We use the standard game theoretic notation $(\mathbf{x}_{-i}, y)$ to mean the vector obtained
from $\mathbf{x}$ by replacing the $i$-th entry with $y$, i.e. $(\mathbf{x}_{-i}, y) = (x_1, \ldots, x_{i-1}, y, x_{i+1}, \ldots, x_n)$.

2 Markov chains summary and notation



We summarize the main tools we use to bound the mixing time of Markov chains (for a complete
description of such tools see, for example, Chapters 5.2, 7.2, 12.2 and 14.2 of [16]; we refer the reader
to [16] also for notational conventions).

Consider a Markov chain $\mathcal{M}$ with finite state space $\Omega$ and transition matrix $P$. It is a classical result
that for an irreducible and aperiodic Markov chain¹ there exists a unique stationary distribution $\pi$ over
$\Omega$; that is, a distribution $\pi$ on $\Omega$ such that $\pi \cdot P = \pi$.

The total variation distance $\|\mu - \nu\|_{TV}$ between two probability distributions $\mu$ and $\nu$ on $\Omega$ is defined as

$$\|\mu - \nu\|_{TV} = \max_{A \subseteq \Omega} |\mu(A) - \nu(A)| .$$

An irreducible and aperiodic Markov chain $\mathcal{M}$ converges to its stationary distribution $\pi$; specifically,
there exists $\alpha \in (0, 1)$ such that

$$d(t) \leq \alpha^t ,$$

where

$$d(t) = \max_{x \in \Omega} \|P^t(x, \cdot) - \pi\|_{TV}$$

and $P^t(x, \cdot)$ is the distribution at time $t$ of the Markov chain starting at $x$. For $\varepsilon \in (0, 1/2)$, the mixing
time is defined as

$$t_{mix}(\varepsilon) = \min\{t \in \mathbb{N} : d(t) \leq \varepsilon\} .$$

It is usual to set $\varepsilon = 1/4$ or $\varepsilon = 1/(2e)$. If not explicitly specified, when we write $t_{mix}$ we mean $t_{mix}(1/4)$.
Observe that $t_{mix}(\varepsilon) \leq \lceil \log_2 \varepsilon^{-1} \rceil \, t_{mix}$.
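
For small chains the quantities above can be evaluated directly. The following is a minimal sketch, not part of the original analysis: it uses the standard identity that the total variation distance equals half the $\ell_1$ distance, and the two-state chain `P_example` is a hypothetical example chosen only for illustration.

```python
def tv_distance(mu, nu):
    """Total variation distance, computed as half the l1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def step(dist, P):
    """Advance a distribution (row vector) by one step of the chain."""
    n = len(P)
    return [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]


def mixing_time(P, pi, eps=0.25, max_steps=10_000):
    """Smallest t with d(t) <= eps, taking the worst case over starting states."""
    n = len(P)
    # One point-mass distribution per starting state.
    dists = [[1.0 if y == x else 0.0 for y in range(n)] for x in range(n)]
    for t in range(max_steps + 1):
        if max(tv_distance(d, pi) for d in dists) <= eps:
            return t
        dists = [step(d, P) for d in dists]
    raise ValueError("d(t) did not drop to eps within max_steps")


# Hypothetical example: a lazy two-state chain with uniform stationary distribution.
P_example = [[0.75, 0.25], [0.25, 0.75]]
pi_example = [0.5, 0.5]
print(mixing_time(P_example, pi_example))
```

This brute-force evaluation costs time cubic in the number of states per step, so it is only meant for sanity checks on very small games.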

Coupling. A coupling of two probability distributions $\mu$ and $\nu$ on $\Omega$ is a pair of random variables
$(X, Y)$ defined on $\Omega \times \Omega$ such that the marginal distribution of $X$ is $\mu$ and the marginal distribution
of $Y$ is $\nu$. A coupling of a Markov chain $\mathcal{M}$ with transition matrix $P$ is a process $(X_t, Y_t)_{t=0}^{\infty}$ with the
property that both $X_t$ and $Y_t$ are Markov chains with transition matrix $P$. When the two coupled chains
start at $(X_0, Y_0) = (x, y)$, we write $\mathbf{P}_{x,y}(\cdot)$ and $\mathbf{E}_{x,y}[\cdot]$ for the probability and the expectation on the
space where the two chains are both defined.
We denote by $\tau_{couple}$ the first time the two chains meet; that is,

$$\tau_{couple} = \min\{t : X_t = Y_t\} .$$

We will consider only couplings of Markov chains with the property that for $s \geq \tau_{couple}$, it holds $X_s = Y_s$.
The following theorem can be used to give an upper bound on $t_{mix}$ (see, for example, Corollary 5.3 in
[16]).

Theorem 1 (Coupling) Let $\mathcal{M}$ be a Markov chain with state space $\Omega$ and transition matrix $P$. For
each pair of states $x, y \in \Omega$ consider a coupling $(X_t, Y_t)$ of $\mathcal{M}$ with starting states $X_0 = x$ and $Y_0 = y$.
Then

$$d(t) \leq \max_{x, y \in \Omega} \mathbf{P}_{x,y}(\tau_{couple} > t) .$$

Sometimes it is difficult to specify a coupling and to analyze the coupling time $\tau_{couple}$ for each pair of
starting states $x$ and $y$. The Path Coupling theorem says that it is sufficient to define a coupling only
for pairs of Markov chains starting from adjacent states, and an upper bound on the mixing time can
be obtained if each of these couplings contracts their distance on average. More precisely, consider a
Markov chain $\mathcal{M}$ with state space $\Omega$ and transition matrix $P$; let $G = (\Omega, E)$ be a connected graph and
let $w : E \to \mathbb{R}$ be a function assigning weights to the edges such that $w(e) \geq 1$ for every edge $e \in E$;
for $x, y \in \Omega$, we denote by $\rho(x, y)$ the weight of the shortest path in $G$ between $x$ and $y$. The following
theorem holds.

1 Roughly speaking, a finite-state Markov chain is irreducible and aperiodic if there is a time t such that, for all pairs of 
states x, y, the probability to be in y after t steps, starting from x, is positive. 

Theorem 2 (Path Coupling [4]) Suppose that for every edge $\{x, y\} \in E$ a coupling $(X_t, Y_t)$ of $\mathcal{M}$
with $X_0 = x$ and $Y_0 = y$ exists such that $\mathbf{E}_{x,y}[\rho(X_1, Y_1)] \leq e^{-\alpha} \cdot w(\{x, y\})$ for some $\alpha > 0$. Then

$$t_{mix}(\varepsilon) \leq \frac{\log(\mathrm{diam}(G)) + \log(1/\varepsilon)}{\alpha} ,$$

where $\mathrm{diam}(G)$ is the (weighted) diameter of $G$.



Spectral techniques. A Markov chain $\mathcal{M}$ with state space $\Omega$ and transition matrix $P$ is said to be reversible
if for all $x, y \in \Omega$, it holds that

$$\pi(x) \cdot P(x, y) = \pi(y) \cdot P(y, x) .$$

The eigenvalues of the transition matrix $P$ of a reversible Markov chain $\mathcal{M}$ can be used to obtain upper
and lower bounds on the mixing time. Observe that all the eigenvalues of any transition matrix $P$ have
absolute value at most 1, $\lambda = 1$ is an eigenvalue, and for irreducible and aperiodic chains, $-1$ is not an
eigenvalue. The relaxation time $t_{rel}$ of a Markov chain $\mathcal{M}$ is defined as

$$t_{rel} = \frac{1}{1 - \lambda^\star} ,$$

where $\lambda^\star$ is the largest absolute value among eigenvalues other than 1,

$$\lambda^\star = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } P,\ \lambda \neq 1\} .$$

Observe that, for $\mathcal{M}$ reversible, irreducible and aperiodic, $0 \leq \lambda^\star < 1$ and thus $t_{rel}$ is positive and finite.
We have the following theorem (see, for example, Theorems 12.3 and 12.4 in [16]).

Theorem 3 (Relaxation time) Let $P$ be the transition matrix of a reversible, irreducible, and aperiodic
Markov chain with state space $\Omega$ and stationary distribution $\pi$. Then

$$(t_{rel} - 1) \log\left(\frac{1}{2\varepsilon}\right) \leq t_{mix}(\varepsilon) \leq \log\left(\frac{1}{\varepsilon\, \pi_{min}}\right) t_{rel} ,$$

where $\pi_{min} = \min_{x \in \Omega} \pi(x)$.



Lower bound. We will use the following theorem to derive our lower bounds (see, for example, Theorem 7.3
in [16]).

Theorem 4 (Bottleneck ratio) Let $\mathcal{M} = \{X_t : t \in \mathbb{N}\}$ be an irreducible and aperiodic Markov chain
with finite state space $\Omega$, transition matrix $P$, and stationary distribution $\pi$. Let $S \subseteq \Omega$ be any set with
$\pi(S) \leq 1/2$. Then the mixing time is

$$t_{mix}(\varepsilon) \geq \frac{1 - 2\varepsilon}{2\, \Phi(S)} ,$$

where

$$\Phi(S) = \frac{Q(S, \bar{S})}{\pi(S)} \quad \text{and} \quad Q(S, \bar{S}) = \sum_{x \in S,\ y \in \bar{S}} \pi(x) P(x, y) .$$
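
As an illustration of how the bottleneck ratio is evaluated, the sketch below computes $\Phi(S)$ for an explicit transition matrix. It assumes the lower bound has the standard form $(1 - 2\varepsilon)/(2\Phi(S))$; the two-state chain is a hypothetical example, not one of the games studied in this paper.

```python
def bottleneck_ratio(P, pi, S):
    """Phi(S) = Q(S, S_bar) / pi(S) for a set S of state indices."""
    S = set(S)
    S_bar = [y for y in range(len(P)) if y not in S]
    # Q(S, S_bar): stationary probability flow leaving S in one step.
    Q = sum(pi[x] * P[x][y] for x in S for y in S_bar)
    return Q / sum(pi[x] for x in S)


def bottleneck_lower_bound(P, pi, S, eps=0.25):
    """Assumed form of the bound: (1 - 2*eps) / (2 * Phi(S)), for pi(S) <= 1/2."""
    if sum(pi[x] for x in S) > 0.5:
        raise ValueError("the bound requires pi(S) <= 1/2")
    return (1 - 2 * eps) / (2 * bottleneck_ratio(P, pi, S))


# Hypothetical lazy two-state chain with uniform stationary distribution.
P_example = [[0.75, 0.25], [0.25, 0.75]]
pi_example = [0.5, 0.5]
print(bottleneck_ratio(P_example, pi_example, {0}))
print(bottleneck_lower_bound(P_example, pi_example, {0}))
```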



3 The model and the problem 

A strategic game is a triple $([n], \mathcal{S}, \mathcal{U})$, where $[n] = \{1, \ldots, n\}$ is a finite set of players, $\mathcal{S} = \{S_1, \ldots, S_n\}$
is a family of non-empty finite sets ($S_i$ is the set of strategies for player $i$), and $\mathcal{U} = \{u_1, \ldots, u_n\}$ is a
family of utility functions (or payoffs), where $u_i : S_1 \times \cdots \times S_n \to \mathbb{R}$ is the utility function of player $i$.

Consider the following noisy best-response dynamics, introduced in [3] and known as logit dynamics: at 
every time step 

1. Select one player $i \in [n]$ uniformly at random;

2. Update the strategy of player $i$ according to the following probability distribution over the set $S_i$
of her strategies. For every $y \in S_i$

$$\sigma_i(y \mid \mathbf{x}) = \frac{1}{T_i(\mathbf{x})}\, e^{\beta u_i(\mathbf{x}_{-i}, y)} \qquad (1)$$

where $\mathbf{x} \in S_1 \times \cdots \times S_n$ is the strategy profile played at the current time step,
$T_i(\mathbf{x}) = \sum_{z \in S_i} e^{\beta u_i(\mathbf{x}_{-i}, z)}$ is the normalizing factor, and $\beta \geq 0$.

Parameter $\beta$ is called the inverse noise of the system; indeed, from (1) it is easy to see that, for $\beta = 0$, player $i$
selects her strategy uniformly at random, for $\beta > 0$ the probability is biased toward strategies promising
higher payoffs, and for $\beta \to \infty$ player $i$ chooses her best response strategy (if more than one best response
is available, she chooses uniformly at random one of them). Moreover, observe that the probability $\sigma_i(y \mid \mathbf{x})$
does not depend on the strategy $x_i$ currently adopted by player $i$.
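
The update rule is a softmax over the selected player's payoffs. A minimal sketch follows; the function name `logit_update` and the list-based interface are illustrative assumptions, and the maximum is subtracted before exponentiating only for numerical stability (a constant shift of the payoffs cancels in the ratio).

```python
import math


def logit_update(utilities, beta):
    """Logit distribution over one player's strategies.

    utilities[y] stands for u_i(x_{-i}, y), the payoff of the selected
    player for each of her strategies y, with the other players fixed.
    """
    m = max(utilities)
    # Shifting all payoffs by the same constant leaves the ratios unchanged.
    weights = [math.exp(beta * (u - m)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


print(logit_update([1.0, -1.0], 0.0))   # beta = 0: uniform choice
print(logit_update([1.0, -1.0], 1.0))   # beta > 0: biased toward the higher payoff
print(logit_update([1.0, -1.0], 50.0))  # large beta: close to a best response
```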

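The above dynamics defines a Markov chain with the set of strategy profiles as state space, and
where the transition probability from profile $\mathbf{x} = (x_1, \ldots, x_n)$ to profile $\mathbf{y} = (y_1, \ldots, y_n)$ is zero if
$H(\mathbf{x}, \mathbf{y}) \geq 2$ and it is $\frac{1}{n} \sigma_i(y_i \mid \mathbf{x})$ if the two profiles differ exactly at player $i$. More formally, we can define
the logit dynamics as follows.

Definition 5 (Logit dynamics [3]) Let $\mathcal{G} = ([n], \mathcal{S}, \mathcal{U})$ be a strategic game and let $\beta \geq 0$. The logit
dynamics for $\mathcal{G}$ is the Markov chain $\mathcal{M}_\beta = \{X_t : t \in \mathbb{N}\}$ with state space $\Omega = S_1 \times \cdots \times S_n$ and transition
matrix

$$P(\mathbf{x}, \mathbf{y}) = \frac{1}{n} \cdot \begin{cases} \sigma_i(y_i \mid \mathbf{x}), & \text{if } \mathbf{y}_{-i} = \mathbf{x}_{-i} \text{ and } y_i \neq x_i; \\ \sum_{i=1}^{n} \sigma_i(y_i \mid \mathbf{x}), & \text{if } \mathbf{y} = \mathbf{x}; \\ 0, & \text{otherwise;} \end{cases} \qquad (2)$$

where $\sigma_i(y_i \mid \mathbf{x})$ is defined in (1).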
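
For small games the transition matrix of the definition above can be built explicitly. The sketch below is illustrative only: the function name, the callback `utility(i, profile)`, and the 2-player coordination payoff `coordination` are assumptions introduced for the example.

```python
import itertools
import math


def logit_transition_matrix(strategy_sets, utility, beta):
    """Return (profiles, P), with P the logit-dynamics transition matrix.

    utility(i, profile) is the payoff of player i (0-indexed) at profile.
    """
    profiles = list(itertools.product(*strategy_sets))
    index = {x: k for k, x in enumerate(profiles)}
    n = len(strategy_sets)
    P = [[0.0] * len(profiles) for _ in profiles]
    for x in profiles:
        for i in range(n):
            # All profiles reachable when player i updates, others fixed.
            alts = [x[:i] + (z,) + x[i + 1:] for z in strategy_sets[i]]
            weights = [math.exp(beta * utility(i, a)) for a in alts]
            total = sum(weights)
            for a, w in zip(alts, weights):
                # Player i is selected with probability 1/n; the case a == x
                # accumulates the diagonal entry over all players.
                P[index[x]][index[a]] += w / (total * n)
    return profiles, P


def coordination(i, profile):
    """Hypothetical payoff: 1 if the two players agree, 0 otherwise."""
    return 1.0 if profile[0] == profile[1] else 0.0


profiles, P = logit_transition_matrix([(0, 1), (0, 1)], coordination, beta=1.5)
for x, row in zip(profiles, P):
    print(x, [round(p, 4) for p in row])
```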



Properties. Logit dynamics enjoys some interesting properties: 

Ergodicity. It is easy to see that the logit dynamics is irreducible and aperiodic. Indeed, let $\mathbf{x} = (x_1, \ldots, x_n)$
and $\mathbf{y} = (y_1, \ldots, y_n)$ be two profiles and let $(\mathbf{z}^0, \ldots, \mathbf{z}^n)$ be a path of profiles where $\mathbf{z}^0 = \mathbf{x}$,
$\mathbf{z}^n = \mathbf{y}$ and $\mathbf{z}^i = (y_1, \ldots, y_i, x_{i+1}, \ldots, x_n)$ for $i = 1, \ldots, n-1$. The probability that the chain starting
at $\mathbf{x}$ is in $\mathbf{y}$ after $n$ steps is

$$P^n(\mathbf{x}, \mathbf{y}) = P^n(\mathbf{z}^0, \mathbf{z}^n) \geq P^{n-1}(\mathbf{z}^0, \mathbf{z}^{n-1})\, P(\mathbf{z}^{n-1}, \mathbf{z}^n)$$

and recursively

$$P^n(\mathbf{x}, \mathbf{y}) \geq \prod_{i=1}^{n} P(\mathbf{z}^{i-1}, \mathbf{z}^i) > 0 ,$$

where the last inequality follows from (2) because, for all $i = 1, \ldots, n$, the Hamming distance between
$\mathbf{z}^{i-1}$ and $\mathbf{z}^i$ is at most 1. Hence there is a unique stationary distribution $\pi$ and, for every starting profile
$\mathbf{x}$, the distribution $P^t(\mathbf{x}, \cdot)$ of the chain converges to $\pi$ in total variation as $t$ tends to infinity.

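Invariance under utility translation. Let $\mathcal{G} = ([n], \mathcal{S}, \mathcal{U})$ be a game. If we change the utility functions by
adding a constant $c_i$ to all the utilities of player $i$, i.e. if we define a new family $\tilde{\mathcal{U}} = \{\tilde{u}_i : i \in [n]\}$ of
utility functions as follows

$$\tilde{u}_i(\mathbf{x}) := u_i(\mathbf{x}) + c_i \quad \text{for all } \mathbf{x} ,$$

we get a new game $\tilde{\mathcal{G}} = ([n], \mathcal{S}, \tilde{\mathcal{U}})$ but the same logit dynamics. Indeed, according to (1), the probability
that player $i$ chooses strategy $y$ when the game is at profile $\mathbf{x}$ is

$$\tilde{\sigma}_i(y \mid \mathbf{x}) = \frac{e^{\beta \tilde{u}_i(\mathbf{x}_{-i}, y)}}{\sum_{z \in S_i} e^{\beta \tilde{u}_i(\mathbf{x}_{-i}, z)}} = \frac{1}{\sum_{z \in S_i} e^{\beta [\tilde{u}_i(\mathbf{x}_{-i}, z) - \tilde{u}_i(\mathbf{x}_{-i}, y)]}} = \frac{1}{\sum_{z \in S_i} e^{\beta [u_i(\mathbf{x}_{-i}, z) - u_i(\mathbf{x}_{-i}, y)]}} = \sigma_i(y \mid \mathbf{x}) .$$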

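Noise changes under utility rescaling. While translations of utilities do not affect logit dynamics, a
rescaling of the utility functions by the same constant $\alpha > 0$ changes the inverse noise from $\beta$ to $\alpha \cdot \beta$.
Indeed, if for every player $i$ and every profile $\mathbf{x}$ we set

$$\tilde{u}_i(\mathbf{x}) := \alpha \cdot u_i(\mathbf{x}) ,$$

from (1) we have

$$\tilde{\sigma}_i(y \mid \mathbf{x}) = \frac{e^{\beta \tilde{u}_i(\mathbf{x}_{-i}, y)}}{\sum_{z \in S_i} e^{\beta \tilde{u}_i(\mathbf{x}_{-i}, z)}} = \frac{e^{\alpha \beta u_i(\mathbf{x}_{-i}, y)}}{\sum_{z \in S_i} e^{\alpha \beta u_i(\mathbf{x}_{-i}, z)}} .$$

Notice that, unlike translation constants, here we must have the same rescaling constant $\alpha$ for all utility
functions.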

Potential Games. A game $\mathcal{G} = ([n], \mathcal{S}, \mathcal{U})$ is said to be an (exact) potential game if a function
$\Phi : S_1 \times \cdots \times S_n \to \mathbb{R}$ exists such that, for every player $i$ and for every pair of profiles $\mathbf{x}$ and $\mathbf{y}$ that differ only
at position $i$, it holds that $u_i(\mathbf{x}) - u_i(\mathbf{y}) = \Phi(\mathbf{x}) - \Phi(\mathbf{y})$. It is easy to see that, if $\mathcal{G} = ([n], \mathcal{S}, \mathcal{U})$ is
a potential game with potential function $\Phi$, then the Markov chain given by (2) is reversible and its
stationary distribution is the Gibbs measure

$$\pi(\mathbf{x}) = \frac{1}{Z}\, e^{\beta \Phi(\mathbf{x})} \qquad (3)$$

where $Z = \sum_{\mathbf{y} \in S_1 \times \cdots \times S_n} e^{\beta \Phi(\mathbf{y})}$ is the normalizing constant. Except for the Matching Pennies example
in Subsection 3.1, all the games we analyze in this paper are potential games.

Logit dynamics vs Glauber dynamics. When $\mathcal{G}$ is a potential game, the logit dynamics is equivalent to
the well-studied Glauber dynamics. For state space $\Omega = S_1 \times \cdots \times S_n$ and probability distribution $\mu$ over
$\Omega$, the Glauber dynamics for $\mu$ proceeds as follows: from profile $\mathbf{x} \in \Omega$, pick a player $i \in [n]$ uniformly
at random and update her strategy at $y \in S_i$ with probability $\mu$ conditioned on the other players being
at $\mathbf{x}_{-i}$, i.e.

$$\mu(y \mid \mathbf{x}_{-i}) = \frac{\mu(\mathbf{x}_{-i}, y)}{\sum_{z \in S_i} \mu(\mathbf{x}_{-i}, z)} .$$

It is easy to see that the Markov chain defined by the Glauber dynamics is irreducible, aperiodic, and
reversible with stationary distribution $\mu$. When $\mathcal{G} = ([n], \mathcal{S}, \mathcal{U})$ is a potential game with potential function
$\Phi$, the logit dynamics defines the same Markov chain as the Glauber dynamics for the Gibbs distribution
$\pi$ in (3). Indeed, in that case we have

$$\sigma_i(y \mid \mathbf{x}) = \frac{e^{\beta \Phi(\mathbf{x}_{-i}, y)}}{\sum_{z \in S_i} e^{\beta \Phi(\mathbf{x}_{-i}, z)}} = \frac{\pi(\mathbf{x}_{-i}, y)}{\sum_{z \in S_i} \pi(\mathbf{x}_{-i}, z)} = \pi(y \mid \mathbf{x}_{-i}) .$$

Hence, logit dynamics for potential games and Glauber dynamics for Gibbs distributions are two ways
of looking at the same Markov chains: in the former case the dynamics is derived from the potential
function, in the latter case from the stationary distribution. However, observe that, if $\mathcal{G}$ is not a potential
game and $\pi$ is the stationary distribution of the logit dynamics for $\mathcal{G}$, in general the Glauber dynamics
for $\pi$ is different from the logit dynamics (see, for example, the Matching Pennies case in Subsection 3.1).

Due to the analogies between logit and Glauber dynamics, we will sometimes adopt the terminology
used by physicists to indicate the quantities involved; in particular we will call parameter $\beta$ the inverse
noise or inverse temperature and we will call partition function the normalizing constant $Z$ of the Gibbs
distribution (3).



Stationary expected social welfare and mixing time. Let $W : S_1 \times \cdots \times S_n \to \mathbb{R}$ be a social
welfare function (in this paper we assume that $W$ is simply the sum of all the utility functions, $W(\mathbf{x}) = \sum_{i=1}^{n} u_i(\mathbf{x})$,
but clearly any other function of interest can be analysed). We study the stationary expected
social welfare, i.e. the expectation of $W$ when the strategy profiles are random according to the stationary
distribution $\pi$ of the Markov chain,

$$\mathbf{E}_\pi[W] = \sum_{\mathbf{x} \in S_1 \times \cdots \times S_n} W(\mathbf{x})\, \pi(\mathbf{x}) .$$

Since the Markov chain defined in (2) is irreducible and aperiodic, from every initial profile $\mathbf{x}$ the
distribution $P^t(\mathbf{x}, \cdot)$ of the chain $X_t$ starting at $\mathbf{x}$ will eventually converge to $\pi$ as $t$ tends to infinity. We will
be interested in bounding how long it takes to get close to the stationary distribution, that is, the mixing
time of the Markov chain.

In the next subsection we illustrate the goals of our work with two simple examples. 



3.1 Two simple examples: Matching Pennies and a Stairs game 



Matching Pennies. Consider the classical Matching Pennies game. We write the utility functions in
the standard bimatrix form.

$$\begin{array}{c|cc} & H & T \\ \hline H & +1,\ -1 & -1,\ +1 \\ T & -1,\ +1 & +1,\ -1 \end{array} \qquad (4)$$



According to (1), the update probabilities for the logit dynamics are, for every $x \in \{H, T\}$,

$$\sigma_1(H \mid (x, H)) = \sigma_1(T \mid (x, T)) = \frac{1}{1 + e^{-2\beta}} = \sigma_2(T \mid (H, x)) = \sigma_2(H \mid (T, x)) ,$$

$$\sigma_1(T \mid (x, H)) = \sigma_1(H \mid (x, T)) = \frac{1}{1 + e^{2\beta}} = \sigma_2(H \mid (H, x)) = \sigma_2(T \mid (T, x)) .$$

Hence the transition matrix (see (2)) is

$$P = \begin{array}{c|cccc} & HH & HT & TH & TT \\ \hline HH & 1/2 & b/2 & (1-b)/2 & 0 \\ HT & (1-b)/2 & 1/2 & 0 & b/2 \\ TH & b/2 & 0 & 1/2 & (1-b)/2 \\ TT & 0 & (1-b)/2 & b/2 & 1/2 \end{array}$$

where, for readability's sake, we named $b = \frac{1}{1 + e^{-2\beta}}$.

Since every column of the matrix adds up to 1, the uniform distribution $\pi$ over the set of strategy
profiles is the stationary distribution for the logit dynamics. The stationary expected social welfare is
thus 0 for every inverse noise $\beta$.

As for the mixing time, it is easy to see that it is upper bounded by a constant independent of $\beta$.
Indeed, a direct calculation shows that, for every $\mathbf{x} \in \{HH, HT, TH, TT\}$ and for every $\beta \geq 0$ it holds
that

$$\|P^3(\mathbf{x}, \cdot) - \pi\|_{TV} < \frac{1}{2} .$$
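
The Matching Pennies chain is small enough to check numerically. The sketch below rebuilds the $4 \times 4$ transition matrix from the update probabilities, with states ordered HH, HT, TH, TT and $b = 1/(1 + e^{-2\beta})$, and evaluates the worst-case distance from the uniform distribution after three steps. The function names and the list of tested values of $\beta$ are assumptions made for illustration.

```python
import math


def matching_pennies_matrix(beta):
    """Logit-dynamics transition matrix, states ordered HH, HT, TH, TT."""
    b = 1.0 / (1.0 + math.exp(-2.0 * beta))
    return [
        [0.5, b / 2, (1 - b) / 2, 0.0],
        [(1 - b) / 2, 0.5, 0.0, b / 2],
        [b / 2, 0.0, 0.5, (1 - b) / 2],
        [0.0, (1 - b) / 2, b / 2, 0.5],
    ]


def mat_mul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def worst_tv_after(P, t):
    """max over starting states of the TV distance between P^t(x, .) and uniform."""
    n = len(P)
    Pt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        Pt = mat_mul(Pt, P)
    return max(0.5 * sum(abs(p - 1.0 / n) for p in row) for row in Pt)


for beta in (0.0, 0.5, 1.0, 5.0, 20.0):
    P = matching_pennies_matrix(beta)
    column_sums = [sum(P[r][c] for r in range(4)) for c in range(4)]
    print(beta, column_sums, worst_tv_after(P, 3))
```

The column sums confirm that the matrix is doubly stochastic, which is why the uniform distribution is stationary for every $\beta$.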



A stairs game. One of the main techniques used to give upper bounds on the mixing time of Markov
chains is the coupling technique (see Theorem 1). In the following example we use it to upper bound the
mixing time of the logit dynamics for a simple game.

Let $\mathcal{G}$ be a potential game where every player has two strategies, say upstairs (or 1) and downstairs
(or 0), and the potential of a profile $\mathbf{x} \in \{0, 1\}^n$ is the number of players that are upstairs, i.e. $\Phi(\mathbf{x}) = |\mathbf{x}|$.

Notice that the logit dynamics (and thus the stationary distribution and the mixing time) is completely
defined by the potential function, while if we wanted to evaluate the stationary expected social
welfare we would need to specify the utility functions.

The partition function is

$$Z = \sum_{\mathbf{x} \in \{0,1\}^n} e^{\beta |\mathbf{x}|} = \sum_{k=0}^{n} \binom{n}{k} e^{\beta k} = (1 + e^{\beta})^n .$$

So the stationary distribution is

$$\pi(\mathbf{x}) = \frac{e^{\beta |\mathbf{x}|}}{(1 + e^{\beta})^n} .$$



As for the mixing time, we can use the coupling technique as follows: observe that the probability of
playing strategy 1 (or equivalently strategy 0), for the player selected for the update, is independent of
the current strategies of the other players. Indeed, according to (1), for every $\mathbf{x}$ it holds that

$$\sigma_i(1 \mid \mathbf{x}) = \frac{e^{\beta u_i(\mathbf{x}_{-i}, 1)}}{e^{\beta u_i(\mathbf{x}_{-i}, 1)} + e^{\beta u_i(\mathbf{x}_{-i}, 0)}} = \frac{1}{1 + e^{\beta (u_i(\mathbf{x}_{-i}, 0) - u_i(\mathbf{x}_{-i}, 1))}} = \frac{1}{1 + e^{\beta (\Phi(\mathbf{x}_{-i}, 0) - \Phi(\mathbf{x}_{-i}, 1))}} = \frac{1}{1 + e^{\beta (|\mathbf{x}_{-i}| - (|\mathbf{x}_{-i}| + 1))}} = \frac{1}{1 + e^{-\beta}} .$$

We can define a coupling of two Markov chains starting at two different profiles as follows: choose $i \in [n]$
uniformly at random and perform the same update at player $i$ in both chains.² When every player has
been chosen at least once the two chains have coalesced. By the coupon collector's argument, it takes
$O(n \log n)$ steps to have that, with probability at least 3/4, all players have been chosen at least once. By
applying Theorem 1 we have that the mixing time is $O(n \log n)$.
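
The coupling for the stairs game is easy to simulate. The sketch below is a hedged illustration, not part of the proof: the function name and the use of a seeded `random.Random` instance are assumptions. Both chains receive the same player and the same new strategy at every step, so they agree on every coordinate that has been selected at least once.

```python
import math
import random


def stairs_coupling_time(x, y, beta, rng):
    """Number of steps until two coupled stairs-game chains coalesce."""
    x, y = list(x), list(y)
    n = len(x)
    p_up = 1.0 / (1.0 + math.exp(-beta))  # probability of playing strategy 1
    t = 0
    while x != y:
        i = rng.randrange(n)                 # same player in both chains
        new = 1 if rng.random() < p_up else 0
        x[i] = new                           # same update in both chains
        y[i] = new
        t += 1
    return t


rng = random.Random(0)
n = 20
times = [stairs_coupling_time([0] * n, [1] * n, 1.0, rng) for _ in range(200)]
# Average coalescence time, to be compared with n * ln(n) from coupon collecting.
print(sum(times) / len(times), n * math.log(n))
```

Starting from the all-0 and all-1 profiles, the coalescence time is exactly the time needed to select every player at least once, which is why it does not depend on $\beta$.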

In the above examples, it turned out that the mixing time of the logit dynamics can be upper bounded
by functions that do not depend on the inverse noise $\beta$. As we shall see in the next sections, this is not
always the case. Moreover, the analysis of the mixing time is usually far from trivial.

3.2 Description of the Coupling 

Throughout the paper we will use the coupling and path-coupling techniques (see Theorem 1 and Theorem 2)
to give upper bounds on mixing times. Since we will use the same coupling idea in several proofs,
we describe it here and we will refer to this description when we need it.

Consider an n-player 2-strategy game $\mathcal{G}$ and let us rename 0 and 1 the strategies of every player. For
every pair of strategy profiles $\mathbf{x} = (x_1, \ldots, x_n)$, $\mathbf{y} = (y_1, \ldots, y_n) \in \{0, 1\}^n$ we define a coupling $(X_t, Y_t)$
of two copies of the Markov chain with transition matrix $P$ defined in (2) for which $X_0 = \mathbf{x}$ and $Y_0 = \mathbf{y}$.

The coupling proceeds as follows: first, pick a player $i$ uniformly at random; then, update the
strategies $x_i$ and $y_i$ of player $i$ in the two chains, by setting

$$(x_i, y_i) = \begin{cases} (0, 0), & \text{with probability } \min\{\sigma_i(0 \mid \mathbf{x}), \sigma_i(0 \mid \mathbf{y})\}; \\ (1, 1), & \text{with probability } \min\{\sigma_i(1 \mid \mathbf{x}), \sigma_i(1 \mid \mathbf{y})\}; \\ (0, 1), & \text{with probability } \sigma_i(0 \mid \mathbf{x}) - \min\{\sigma_i(0 \mid \mathbf{x}), \sigma_i(0 \mid \mathbf{y})\}; \\ (1, 0), & \text{with probability } \sigma_i(1 \mid \mathbf{x}) - \min\{\sigma_i(1 \mid \mathbf{x}), \sigma_i(1 \mid \mathbf{y})\}. \end{cases}$$

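Three easy observations are in order: if $\sigma_i(0 \mid \mathbf{x}) = \sigma_i(0 \mid \mathbf{y})$ and player $i$ is chosen, then, after the update,
we have $x_i = y_i$; for every player $i$, at most one of the updates $(x_i, y_i) = (0, 1)$ and $(x_i, y_i) = (1, 0)$ has
positive probability; if $i$ is chosen for update, then the marginal distributions of $x_i$ and $y_i$ agree with
$\sigma_i(\cdot \mid \mathbf{x})$ and $\sigma_i(\cdot \mid \mathbf{y})$ respectively. Indeed, for $b \in \{0, 1\}$, the probability that $x_i = b$ is

$$\min\{\sigma_i(b \mid \mathbf{x}), \sigma_i(b \mid \mathbf{y})\} + \sigma_i(b \mid \mathbf{x}) - \min\{\sigma_i(b \mid \mathbf{x}), \sigma_i(b \mid \mathbf{y})\} = \sigma_i(b \mid \mathbf{x}) ,$$

and the probability that $y_i = b$ is

$$\min\{\sigma_i(b \mid \mathbf{x}), \sigma_i(b \mid \mathbf{y})\} + \sigma_i(1 - b \mid \mathbf{x}) - \min\{\sigma_i(1 - b \mid \mathbf{x}), \sigma_i(1 - b \mid \mathbf{y})\}$$

$$= \min\{\sigma_i(b \mid \mathbf{x}), \sigma_i(b \mid \mathbf{y})\} + (1 - \sigma_i(b \mid \mathbf{x})) - (1 - \max\{\sigma_i(b \mid \mathbf{x}), \sigma_i(b \mid \mathbf{y})\}) = \sigma_i(b \mid \mathbf{y}) .$$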
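
The four joint probabilities of the coupled update, and the marginal check just carried out, can be reproduced in a few lines. This is a minimal sketch; the function name and the example values 0.7 and 0.4 for $\sigma_i(0 \mid \mathbf{x})$ and $\sigma_i(0 \mid \mathbf{y})$ are hypothetical.

```python
def coupled_update_probs(p0_x, p0_y):
    """Joint law of (x_i, y_i) after the coupled update of player i.

    p0_x stands for sigma_i(0 | x) and p0_y for sigma_i(0 | y).
    """
    p1_x, p1_y = 1.0 - p0_x, 1.0 - p0_y
    return {
        (0, 0): min(p0_x, p0_y),
        (1, 1): min(p1_x, p1_y),
        (0, 1): p0_x - min(p0_x, p0_y),
        (1, 0): p1_x - min(p1_x, p1_y),
    }


probs = coupled_update_probs(0.7, 0.4)
# Marginal of the first chain on strategy 0: should equal sigma_i(0 | x).
print(probs[(0, 0)] + probs[(0, 1)])
# Marginal of the second chain on strategy 0: should equal sigma_i(0 | y).
print(probs[(0, 0)] + probs[(1, 0)])
```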

We define $G = (\Omega, E)$ as the Hamming graph of the game, where $\Omega = \{0, 1\}^n$ is the set of strategy
profiles, and two profiles $\mathbf{x} = (x_1, \ldots, x_n)$, $\mathbf{y} = (y_1, \ldots, y_n) \in \Omega$ are adjacent if they differ only in the
strategy of one player, i.e.

$$\{\mathbf{x}, \mathbf{y}\} \in E \iff \mathbf{x} \sim \mathbf{y} . \qquad (5)$$

For the path coupling technique, the coupling described above is applied only to pairs of adjacent starting 
profiles. 

2 This is the same coupling used in the analysis of the lazy random walk on the hypercube (e.g. see Section 5.3.3 in
[16]), the only difference being that the probability of choosing 0 or 1 is not 1/2, 1/2 but $1/(1 + e^{\beta})$, $1/(1 + e^{-\beta})$.

4 A 3-player congestion game



In this section we analyze the CK game, a simple 3-player linear congestion game introduced in [6].
This game is interesting because it highlights the weakness of the Price of Anarchy notion for the logit
dynamics. Indeed, the CK game exhibits the worst Price of Anarchy with respect to the average social
welfare among all linear congestion games with 3 or more players. But, as we shall see soon, the stationary
expected social welfare of the logit dynamics is always larger than the social welfare of the worst Nash
equilibrium and, for large enough $\beta$, players spend most of the time in the best Nash equilibrium.
Moreover, we will show that the mixing time of the logit dynamics can be bounded independently of
$\beta$: that is, the stationary distribution guarantees a good social welfare and it is quickly reached by the
system.

Let us now describe the CK game. We have 3 players and 6 facilities divided into two sets: $G = \{g_1, g_2, g_3\}$
and $H = \{h_1, h_2, h_3\}$. Player $i \in \{0, 1, 2\}$ has two strategies: Strategy "0" consists in selecting
facilities $(g_i, h_i)$; Strategy "1" consists in selecting facilities $(g_{i+1}, h_{i-1}, h_{i+1})$ (index arithmetic is modulo
3). The cost of a facility is the number of players choosing that facility, and the welfare of a player is
minus the sum of the costs of the facilities she selected. It is easy to see that this game has two pure Nash
equilibria: the solution where every player plays strategy 0 (each player pays 2, which is optimal), and
the solution where every player plays strategy 1 (each player pays 5). The game is a congestion game,
and thus, by [18], it is also a potential game and its potential function is:

$$\Phi(\mathbf{x}) = \sum_{j \in G \cup H} \sum_{i=1}^{L_{\mathbf{x}}(j)} i,$$

where $L_{\mathbf{x}}(j)$ is the number of players using facility $j$ in configuration $\mathbf{x}$.



Stationary expected social welfare. It is easy to see that the update probabilities given by the
logit dynamics for this game (see Equation (1)) only depend on the number of players playing strategy
1 and not on which player is actually playing that strategy. In particular, we have that, from a profile
$\mathbf{x}$, the player $i$, if selected for update, plays strategy 0 with the following probabilities:

$$\sigma_i(0 \mid |\mathbf{x}_{-i}| = 0) = \frac{1}{1 + e^{-4\beta}}, \qquad \sigma_i(0 \mid |\mathbf{x}_{-i}| = 1) = \frac{1}{1 + e^{-2\beta}}, \qquad \sigma_i(0 \mid |\mathbf{x}_{-i}| = 2) = \frac{1}{2}, \qquad (6)$$

and strategy 1 with the remaining probabilities.

The next theorem evaluates the stationary expected social welfare for this game.

Theorem 6 (Expected social welfare) The stationary expected social welfare $\mathbf{E}_\pi[W]$ of the logit dynamics
for the CK game is

$$\mathbf{E}_\pi[W] = -\frac{6 + 39e^{-4\beta} + 63e^{-6\beta}}{1 + 3e^{-4\beta} + 4e^{-6\beta}}.$$

Proof. We notice that two profiles with the same number of players playing strategy 1 have both the
same potential (and thus the same stationary probability) and the same social welfare.
Thus, $\pi(\mathbf{x}) = \pi[k]$ and $W(\mathbf{x}) = W[k]$ for a profile $\mathbf{x}$ such that $|\mathbf{x}| = k$, where

$$\pi[0] = \frac{e^{-6\beta}}{Z(\beta)}, \qquad \pi[1] = \frac{e^{-10\beta}}{Z(\beta)}, \qquad \pi[2] = \pi[3] = \frac{e^{-12\beta}}{Z(\beta)},$$

where $Z(\beta) = e^{-6\beta} + 3e^{-10\beta} + 4e^{-12\beta}$, and

$$W[0] = -6, \qquad W[1] = -13, \qquad W[2] = -16, \qquad W[3] = -15.$$

Hence, the stationary expected social welfare is

$$\mathbf{E}_\pi[W] = -\frac{6 \cdot e^{-6\beta} + 3 \cdot 13 \cdot e^{-10\beta} + (3 \cdot 16 + 15) \cdot e^{-12\beta}}{e^{-6\beta} + 3e^{-10\beta} + 4e^{-12\beta}} = -\frac{6 + 39e^{-4\beta} + 63e^{-6\beta}}{1 + 3e^{-4\beta} + 4e^{-6\beta}}.$$

□

Notice that for $\beta = 0$ we have $\mathbf{E}_\pi[W] = -27/2$, which is better than the social welfare of the worst
Nash equilibrium. This means that, even if each player selects her strategy at random, the logit dynamics
drives the system to a random profile whose expected social welfare is better than that of the worst Nash equilibrium.
We also observe that $\mathbf{E}_\pi[W]$ increases with $\beta$ and thus the long-term behavior of the logit dynamics
gives a better social welfare than the worst Nash equilibrium for any $\beta \geq 0$. Moreover, the stationary
expected social welfare approaches the optimal social welfare as $\beta$ tends to $\infty$.
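The closed form for the CK game can be cross-checked by enumerating the eight profiles. The sketch below is only an illustration under our own conventions: facilities are indexed 0, 1, 2 modulo 3, the stationary weight of a profile is taken proportional to $e^{-\beta\Phi}$ with $\Phi$ the potential of the game, and all function names are hypothetical.

```python
from itertools import product
from math import exp


def facilities(i, s):
    """Facilities used by player i: (g_i, h_i) for s = 0, (g_{i+1}, h_{i-1}, h_{i+1}) for s = 1."""
    if s == 0:
        return [("g", i), ("h", i)]
    return [("g", (i + 1) % 3), ("h", (i - 1) % 3), ("h", (i + 1) % 3)]


def loads(x):
    """Number of players on each used facility in profile x."""
    load = {}
    for i, s in enumerate(x):
        for f in facilities(i, s):
            load[f] = load.get(f, 0) + 1
    return load


def potential(x):
    """Potential: sum over facilities of 1 + 2 + ... + load."""
    return sum(l * (l + 1) // 2 for l in loads(x).values())


def welfare(x):
    """Social welfare: minus the total cost paid by the three players."""
    load = loads(x)
    return -sum(load[f] for i, s in enumerate(x) for f in facilities(i, s))


def stationary_welfare(beta):
    """Expected welfare under weights proportional to exp(-beta * potential)."""
    profiles = list(product((0, 1), repeat=3))
    weights = [exp(-beta * potential(x)) for x in profiles]
    return sum(w * welfare(x) for w, x in zip(weights, profiles)) / sum(weights)


def closed_form(beta):
    """Closed form obtained in the proof of Theorem 6."""
    num = 6 + 39 * exp(-4 * beta) + 63 * exp(-6 * beta)
    den = 1 + 3 * exp(-4 * beta) + 4 * exp(-6 * beta)
    return -num / den
```

Under these assumptions the enumeration and the closed form should agree for every $\beta \geq 0$; at $\beta = 0$ both reduce to the uniform average of the welfare over the eight profiles.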

Mixing time. Now we study the mixing time of the logit dynamics for the CK game and we show that
it is bounded by a constant for any $\beta \geq 0$. The proof will use the Coupling Theorem (see Theorem 1).

Theorem 7 (Mixing time) There exists a constant $\tau$ such that the mixing time $t_{\mathrm{mix}}$ of the logit dynamics
of the CK game is upper bounded by $\tau$ for every $\beta \geq 0$.

Proof. First, we notice that the update probabilities given in Equation (6) imply that

$$\forall i, \forall \mathbf{x}, \forall \beta, \qquad \sigma_i(0 \mid \mathbf{x}) \geq 1/2. \qquad (7)$$

Let $X_t$ and $Y_t$ be two copies of the logit dynamics for the CK game, starting in $\mathbf{x}$ and $\mathbf{y}$ respectively,
coupled as described in Section 3.2. It is easy to check that, by Equation (7), the player selected for
update chooses strategy 0 in both chains with probability at least 1/2.

Finally, we bound the probability that after three steps the two coupled chains coalesce: it is at least
as large as the probability that we choose three different players (which happens with probability
$1 \cdot \frac{2}{3} \cdot \frac{1}{3} = \frac{2}{9}$) and all of them play strategy 0 at their turn, i.e.

$$\mathbf{P}_{\mathbf{x},\mathbf{y}}(X_3 = Y_3) \geq \frac{2}{9} \cdot \frac{1}{8} = \frac{1}{36}.$$

Since this bound holds for every starting pair $(\mathbf{x}, \mathbf{y})$, we have that the probability the two chains have
not yet coalesced after $3t$ steps is

$$\mathbf{P}_{\mathbf{x},\mathbf{y}}(X_{3t} \neq Y_{3t}) \leq \left(1 - \frac{1}{36}\right)^{t} \leq e^{-t/36}.$$

The thesis follows from Theorem 1. □



5 Two player games 

In this section we analyse the performance of the logit dynamics for 2x2 coordination games (the same 
class studied in [3]) and 2x2 anti-coordination games. 

Coordination games. Coordination Games are two-player games in which the players have an advan- 
tage in selecting the same strategy. These games are often used to model the spread of a new technology 
[22]: two players have to decide whether or not to adopt a new technology. We assume that the players
would prefer choosing the same technology as the other one and that choosing the new technology is at 
most as risky as choosing the old one. 

We name 0 the NEW strategy and 1 the OLD strategy. The game is formally described by the following
payoff matrix

$$\begin{array}{c|cc}
 & 0 & 1 \\ \hline
0 & (a, a) & (c, d) \\
1 & (d, c) & (b, b)
\end{array}$$


We assume that $a > d$ and $b > c$ (meaning that they prefer to coordinate) and that $a - d \geq b - c$
(meaning that for each player strategy 0 is at most as risky as strategy 1). Notice that we do not make
any assumption on the relation between $a$ and $b$. For convenience sake we name

$$\Delta := a - d \qquad \text{and} \qquad \delta := b - c.$$

It is easy to see that this game is a potential game and the following function is an exact potential for it:

$$\Phi(0, 0) = \Delta, \qquad \Phi(0, 1) = \Phi(1, 0) = 0, \qquad \Phi(1, 1) = \delta.$$

This game has two pure Nash equilibria: (0, 0), where each player has utility $a$, and (1, 1), where each
player has utility $b$. As $d + c < a + b$, the social welfare is maximized at one of the two equilibria.

We analyse the mixing time of the logit dynamics for 2x2 coordination games and compute its
stationary expected social welfare as a function of $\beta$.



Stationary expected social welfare. The logit dynamics for the coordination game defined by the
payoff matrix above establishes that, from a profile $\mathbf{x}$, player $i$ selected for update plays according to the
following probability distribution (see Equation (1)):

$$\sigma_i(0 \mid x_{-i} = 0) = \frac{1}{1 + e^{-\Delta\beta}}, \qquad \sigma_i(1 \mid x_{-i} = 0) = \frac{1}{1 + e^{\Delta\beta}},$$

$$\sigma_i(0 \mid x_{-i} = 1) = \frac{1}{1 + e^{\delta\beta}}, \qquad \sigma_i(1 \mid x_{-i} = 1) = \frac{1}{1 + e^{-\delta\beta}}.$$

The next theorem evaluates the stationary expected social welfare $\mathbf{E}_\pi[W]$ obtained by the logit dynamics and
gives conditions under which $\mathbf{E}_\pi[W]$ is better than the social welfare $SW_N$ of the worst Nash equilibrium.

Theorem 8 (Expected social welfare) The stationary expected social welfare $\mathbf{E}_\pi[W]$ of the logit dynamics
for the coordination game defined by the payoff matrix above is

$$\mathbf{E}_\pi[W] = 2 \cdot \frac{a + b e^{-(\Delta - \delta)\beta} + (c + d) e^{-\Delta\beta}}{1 + e^{-(\Delta - \delta)\beta} + 2e^{-\Delta\beta}}.$$

Moreover, if $a \neq b$ then $\mathbf{E}_\pi[W] \geq SW_N$ for $\beta$ sufficiently large.

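Proof. The stationary distribution $\pi$ of the logit dynamics is

$$\pi(0, 0) = \frac{e^{\Delta\beta}}{Z(\beta)}, \qquad \pi(1, 1) = \frac{e^{\delta\beta}}{Z(\beta)}, \qquad \pi(0, 1) = \pi(1, 0) = \frac{1}{Z(\beta)},$$

where $Z(\beta) = e^{\Delta\beta} + e^{\delta\beta} + 2$.

Since $\mathbf{E}_\pi[W] = 2 \cdot \mathbf{E}_\pi[u_i]$, we compute the expected utility $\mathbf{E}_\pi[u_i]$ of player $i$ at the stationary
distribution,

$$\mathbf{E}_\pi[u_i] = \sum_{\mathbf{x} \in \{0,1\}^2} u_i(\mathbf{x}) \pi(\mathbf{x}) = \frac{a e^{\Delta\beta} + b e^{\delta\beta} + c + d}{e^{\Delta\beta} + e^{\delta\beta} + 2} = \frac{a + b e^{-(\Delta - \delta)\beta} + (c + d) e^{-\Delta\beta}}{1 + e^{-(\Delta - \delta)\beta} + 2e^{-\Delta\beta}}.$$

Thus, if $a > b$ and $\beta \geq \max\left\{0, \frac{1}{\Delta} \log \frac{2b - c - d}{a - b}\right\}$, we have

$$\mathbf{E}_\pi[W] - SW_N = 2 \cdot \frac{a + b e^{-(\Delta - \delta)\beta} + (c + d) e^{-\Delta\beta}}{1 + e^{-(\Delta - \delta)\beta} + 2e^{-\Delta\beta}} - 2b = 2 \cdot \frac{(a - b) - (2b - c - d) e^{-\Delta\beta}}{1 + e^{-(\Delta - \delta)\beta} + 2e^{-\Delta\beta}} \geq 0.$$

Similarly, we obtain $\mathbf{E}_\pi[W] - SW_N \geq 0$ if $b > a$ and $\beta \geq \max\left\{0, \frac{1}{\delta} \log \frac{2a - c - d}{b - a}\right\}$. □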
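The threshold on $\beta$ in the proof can be explored numerically. The following Python sketch is an illustration only: the function names and the sample payoffs $a = 3$, $b = 2$, $c = d = 0$ are our own choices (they satisfy $a > d$, $b > c$ and $a - d \geq b - c$), and the stationary weights are the ones stated in the proof.

```python
from math import exp, log


def coordination_welfare(a, b, c, d, beta):
    """Stationary expected social welfare by direct summation over the 4 profiles."""
    big_delta, small_delta = a - d, b - c
    weight = {(0, 0): exp(big_delta * beta), (1, 1): exp(small_delta * beta),
              (0, 1): 1.0, (1, 0): 1.0}
    welfare = {(0, 0): 2 * a, (1, 1): 2 * b, (0, 1): c + d, (1, 0): c + d}
    z = sum(weight.values())
    return sum(weight[x] * welfare[x] for x in weight) / z


def threshold(a, b, c, d):
    """Value of beta from the proof above, for the case a > b."""
    return max(0.0, log((2 * b - c - d) / (a - b)) / (a - d))
```

With the sample payoffs, the worst Nash equilibrium has social welfare $2b = 4$; the computed welfare equals 4 at `threshold(3, 2, 0, 0)` and exceeds it for larger $\beta$, while at $\beta = 0$ it is the uniform average $2.5$.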

Mixing time. Now we study the mixing time of the logit dynamics for coordination games and we
show that it is exponential in $\beta$ and in the minimum potential difference between adjacent profiles.

Theorem 9 (Mixing Time) The mixing time of the logit dynamics for the coordination game defined by
the payoff matrix above is $\Theta(e^{\delta\beta})$ for every $\beta \geq 0$.

Proof. Upper bound: We apply the Path Coupling technique (see Theorem 2) with the Hamming graph
defined in (5) and all the edge-weights set to 1. Let $\mathbf{x}$ and $\mathbf{y}$ be two profiles differing only for the player
$j$ and consider the coupling defined in Section 3.2 for this pair of profiles. Now we bound the expected
distance of the two coupled chains after one step.

We denote by $b_i(\mathbf{x}, \mathbf{y})$ the probability that both chains perform the same update given that player $i$
has been selected for strategy update. Clearly, $b_i(\mathbf{x}, \mathbf{y}) = 1$ for $i = j$, while for $i \neq j$, we have

$$b_i(\mathbf{x}, \mathbf{y}) = \min\{\sigma_i(0 \mid \mathbf{x}), \sigma_i(0 \mid \mathbf{y})\} + \min\{\sigma_i(1 \mid \mathbf{x}), \sigma_i(1 \mid \mathbf{y})\} = \frac{1}{1 + e^{\delta\beta}} + \frac{1}{1 + e^{\Delta\beta}}.$$

For sake of readability we set

$$p = \frac{1}{1 + e^{\Delta\beta}} \qquad \text{and} \qquad q = \frac{1}{1 + e^{\delta\beta}},$$

and thus $b_i(\mathbf{x}, \mathbf{y}) = p + q$. To compute $\mathbf{E}_{\mathbf{x},\mathbf{y}}[\rho(X_1, Y_1)]$, we observe that the logit dynamics chooses
player $j$ with probability 1/2. In this case, since $b_j(\mathbf{x}, \mathbf{y}) = 1$, the coupling updates both chains in the same
way, resulting in $X_1 = Y_1$. Similarly, player $i \neq j$ is chosen for strategy update with probability 1/2.
In this case, with probability $b_i(\mathbf{x}, \mathbf{y})$ the coupling performs the same update in both chains resulting
in $\rho(X_1, Y_1) = 1$. Instead with probability $1 - b_i(\mathbf{x}, \mathbf{y})$, the coupling performs different updates on the
chains resulting in $\rho(X_1, Y_1) = 2$. Therefore we have

$$\mathbf{E}_{\mathbf{x},\mathbf{y}}[\rho(X_1, Y_1)] = \frac{1}{2} b_i(\mathbf{x}, \mathbf{y}) + 2 \cdot \frac{1}{2}\left(1 - b_i(\mathbf{x}, \mathbf{y})\right) = 1 - \frac{1}{2} b_i(\mathbf{x}, \mathbf{y}) = 1 - \frac{1}{2}(p + q) \leq e^{-\frac{1}{2}(p + q)}.$$

From Theorem 2 with $\alpha = \frac{1}{2}(p + q)$ and $\mathrm{diam}(\Omega) = 2$, it follows that

$$t_{\mathrm{mix}}(\varepsilon) \leq \frac{2\left(\log 2 + \log(1/\varepsilon)\right)}{p + q}.$$




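Lower bound: We use the relaxation time bound (see Theorem 3). The transition matrix of the logit
dynamics is

$$P = \begin{array}{c|cccc}
 & 00 & 01 & 10 & 11 \\ \hline
00 & 1 - p & p/2 & p/2 & 0 \\
01 & \frac{1-p}{2} & \frac{p+q}{2} & 0 & \frac{1-q}{2} \\
10 & \frac{1-p}{2} & 0 & \frac{p+q}{2} & \frac{1-q}{2} \\
11 & 0 & q/2 & q/2 & 1 - q
\end{array}$$

It is easy to see that the second largest eigenvalue of $P$ is $\lambda^\star = \frac{(1-p) + (1-q)}{2}$, hence the relaxation time is
$t_{\mathrm{rel}} = 1/(1 - \lambda^\star) = \frac{2}{p+q}$, and for the mixing time we have

$$t_{\mathrm{mix}}(\varepsilon) \geq (t_{\mathrm{rel}} - 1) \log \frac{1}{2\varepsilon} = \frac{2 - (p + q)}{p + q} \log \frac{1}{2\varepsilon} \geq \frac{1}{p + q} \log \frac{1}{2\varepsilon}. \qquad (9)$$

In the last inequality we used that $p$ and $q$ are both at most 1/2.
Finally, the theorem follows by observing that

$$\frac{1}{p + q} = \frac{1}{\frac{1}{1 + e^{\Delta\beta}} + \frac{1}{1 + e^{\delta\beta}}} = \Theta(e^{\delta\beta}).$$

□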
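The transition matrix and its second eigenvalue can be checked mechanically. The sketch below is a hedged illustration: it rebuilds the chain from the logit update rule for the coordination payoffs, then verifies that a candidate eigenvector of our own derivation, $(2p, p - q, p - q, -2q)$ on the states (00, 01, 10, 11), has eigenvalue $1 - (p + q)/2$. Function names are hypothetical.

```python
from itertools import product
from math import exp


def coordination_chain(a, b, c, d, beta):
    """Transition probabilities P[x][y] of the logit dynamics for the 2x2 game."""
    u = {(0, 0): (a, a), (0, 1): (c, d), (1, 0): (d, c), (1, 1): (b, b)}
    states = list(product((0, 1), repeat=2))
    P = {x: {y: 0.0 for y in states} for x in states}
    for x in states:
        for i in (0, 1):  # each player is selected with probability 1/2
            # Profiles reachable when player i re-chooses her strategy.
            cand = [tuple(s if k == i else x[k] for k in (0, 1)) for s in (0, 1)]
            w = [exp(beta * u[y][i]) for y in cand]
            for y, wy in zip(cand, w):
                P[x][y] += 0.5 * wy / (w[0] + w[1])
    return P


def relaxation_quantities(a, b, c, d, beta):
    """Return p, q, the claimed eigenvalue, and a candidate eigenvector."""
    p = 1.0 / (1.0 + exp((a - d) * beta))
    q = 1.0 / (1.0 + exp((b - c) * beta))
    lam = 1.0 - (p + q) / 2.0
    v = {(0, 0): 2 * p, (0, 1): p - q, (1, 0): p - q, (1, 1): -2 * q}
    return p, q, lam, v
```

Since $q \leq p + q \leq 2q$ whenever $\delta \leq \Delta$, the quantity $1/(p + q)$ lies between $(1 + e^{\delta\beta})/2$ and $1 + e^{\delta\beta}$, which is one way to read the final $\Theta(e^{\delta\beta})$ estimate.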

Notice that, if we used the relaxation time to upper bound the mixing time (see Theorem 3) we would
get a non-tight bound, hence in the above proof we had to resort to the path coupling for the upper
bound.

Anti-coordination games. Very similar results can be obtained for anti-coordination games. These
are two-player games in which the players have an advantage in selecting different strategies. They model
many settings where there is a common and exclusive resource: two players have to decide whether to
use the resource or to drop it. If they both try to use it, then a deadlock occurs and this is bad for both
players. Usually, these games are described by a payoff matrix like the one given above for coordination games, where we assume
that $d > a$ and $c > b$ and that $d - a \geq c - b$. Notice that Nash equilibria of this game are unfair, as one
player has utility $\max\{c, d\}$ and the other $\min\{c, d\}$.

For the logit dynamics, we have that, for all $\beta$, the stationary expected social welfare is worse than
the one guaranteed by a Nash equilibrium. On the other hand, for sufficiently large $\beta$ we have that the
expected utility of a player is always better than $\min\{c, d\}$: that is, in the logit dynamics each player
expects to gain more than in the worst Nash equilibrium. Moreover, the stationary distribution is a fair
equilibrium, since every player has the same expected utility. As for the coordination games, the mixing
time is exponential in $\beta$ and in the minimum potential difference between adjacent profiles.



6 The OR game 

In this section we consider the following simple $n$-player potential game that we here call OR game.
Every player has two strategies, say $\{0, 1\}$, and each player pays the OR of the strategies of all players
(including herself). More formally, the utility function of player $i \in [n]$ is

$$u_i(\mathbf{x}) = \begin{cases} 0, & \text{if } \mathbf{x} = \mathbf{0};\\ -1, & \text{otherwise.} \end{cases}$$

Notice that the OR game has $2^n - n$ Nash equilibria. The only profiles that are not Nash equilibria are
the $n$ profiles with exactly one player playing 1. The Nash equilibrium $\mathbf{0}$ has social welfare 0, while all the
others have social welfare $-n$.

In Theorem 10 we show that the stationary expected social welfare is always better than the social
welfare of the worst Nash equilibrium, and it is significantly better for large $\beta$. Unfortunately, in
Theorem 11 we show that if $\beta$ is large enough to guarantee a good stationary expected social welfare,
then the time needed to get close to the stationary distribution is exponential in $n$. Finally, in Theorem 12
we give upper bounds on the mixing time showing that if $\beta$ is relatively small then the mixing time is
polynomial in $n$, while for large $\beta$ the upper bound is exponential in $n$ and it is almost-tight with the
lower bound. Despite the simplicity of the game, the analysis of the mixing time is far from trivial.

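Theorem 10 (Expected social welfare) The stationary expected social welfare of the logit dynamics
for the OR game is $\mathbf{E}_\pi[W] = -\alpha n$ where $\alpha = \alpha(n, \beta) = \frac{(2^n - 1)e^{-\beta}}{1 + (2^n - 1)e^{-\beta}}$.

Proof. Observe that the OR game is a potential game with exact potential $\Phi$ where $\Phi(\mathbf{0}) = 0$ and
$\Phi(\mathbf{x}) = -1$ for every $\mathbf{x} \neq \mathbf{0}$. Hence the stationary distribution is

$$\pi(\mathbf{x}) = \begin{cases} 1/Z, & \text{if } \mathbf{x} = \mathbf{0};\\ e^{-\beta}/Z, & \text{if } \mathbf{x} \neq \mathbf{0}; \end{cases}$$

where the normalizing factor is $Z = 1 + (2^n - 1)e^{-\beta}$. The expected social welfare is thus

$$\mathbf{E}_\pi[W] = \sum_{\mathbf{x} \in \{0,1\}^n} W(\mathbf{x}) \pi(\mathbf{x}) = -n \cdot \frac{(2^n - 1)e^{-\beta}}{1 + (2^n - 1)e^{-\beta}}.$$

□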
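As a sanity check, the expected social welfare of the OR game can be computed by brute force for small $n$ and compared with $-\alpha n$. The following Python sketch assumes the stationary weights from the proof above (weight 1 for the all-zero profile and $e^{-\beta}$ for every other profile); the function names are ours.

```python
from itertools import product
from math import exp


def or_game_welfare(n, beta):
    """Stationary expected social welfare of the OR game by enumeration."""
    num = z = 0.0
    for x in product((0, 1), repeat=n):
        u = 0 if not any(x) else -1  # utility of every player at profile x
        w = exp(beta * u)            # unnormalized stationary weight
        z += w
        num += w * n * u             # social welfare is n times the common utility
    return num / z


def alpha(n, beta):
    """The coefficient alpha(n, beta) from Theorem 10."""
    t = (2 ** n - 1) * exp(-beta)
    return t / (1 + t)
```

Since $\alpha < 1$ for every finite $\beta$, the computed value is always strictly larger than $-n$, the social welfare of the worst Nash equilibrium.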

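In the next theorem we show that the mixing time can be polynomial in $n$ only if $\beta \leq c \log n$ for
some constant $c$.

Theorem 11 (Lower bound on mixing time) The mixing time of the logit dynamics for the OR
game is

1. $\Omega(e^{\beta})$ if $\beta < \log(2^n - 1)$;

2. $\Omega(2^n)$ if $\beta > \log(2^n - 1)$.

Proof. Consider the set $S \subseteq \{0, 1\}^n$ containing only the state $\mathbf{0} = (0, \ldots, 0)$ and observe that $\pi(\mathbf{0}) \leq 1/2$
for $\beta \leq \log(2^n - 1)$. The bottleneck ratio of $S$ is

$$\frac{1}{\pi(\mathbf{0})} \sum_{\mathbf{y} \in \{0,1\}^n,\ \mathbf{y} \neq \mathbf{0}} \pi(\mathbf{0}) P(\mathbf{0}, \mathbf{y}) = \sum_{\mathbf{y} \in \{0,1\}^n :\ |\mathbf{y}| = 1} P(\mathbf{0}, \mathbf{y}) = n \cdot \frac{1}{n} \cdot \frac{1}{1 + e^{\beta}} = \frac{1}{1 + e^{\beta}}.$$

Hence, by applying Theorem 4, the mixing time is bounded from below, up to the constant factor of that theorem, by the inverse of this ratio, $1 + e^{\beta}$.

If $\beta > \log(2^n - 1)$ instead we consider the set $R \subseteq \{0, 1\}^n$ containing all states except state $\mathbf{0}$, and
observe that

$$\pi(R) = \frac{1}{Z}(2^n - 1)e^{-\beta} = \frac{(2^n - 1)e^{-\beta}}{1 + (2^n - 1)e^{-\beta}}$$

and $\pi(R) \leq 1/2$ for $\beta > \log(2^n - 1)$. It holds that

$$Q(R, \bar{R}) = \sum_{\mathbf{x} \in R} \pi(\mathbf{x}) P(\mathbf{x}, \mathbf{0}) = \sum_{\mathbf{x} \in \{0,1\}^n :\ |\mathbf{x}| = 1} \pi(\mathbf{x}) P(\mathbf{x}, \mathbf{0}) = n \cdot \frac{e^{-\beta}}{Z} \cdot \frac{1}{n} \cdot \frac{1}{1 + e^{-\beta}} = \frac{e^{-\beta}}{Z} \cdot \frac{1}{1 + e^{-\beta}}.$$

The bottleneck ratio of $R$ is

$$\frac{Q(R, \bar{R})}{\pi(R)} = \frac{e^{-\beta}}{Z} \cdot \frac{1}{1 + e^{-\beta}} \cdot \frac{Z}{(2^n - 1)e^{-\beta}} = \frac{1}{(2^n - 1)(1 + e^{-\beta})}.$$

Hence, by applying Theorem 4, the mixing time is bounded from below, up to the constant factor of that theorem, by the inverse of this ratio, $(2^n - 1)(1 + e^{-\beta}) > 2^n - 1$.

□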
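The two bottleneck ratios used in the proof can be recomputed by brute force for small $n$. The sketch below is an illustration under stated assumptions: it builds the logit dynamics of the OR game directly from the utilities, uses stationary weights proportional to $e^{\beta u(\mathbf{x})}$ as in the proof of Theorem 10, and computes $Q(S, \bar{S})/\pi(S)$ for an arbitrary set $S$. All names are hypothetical.

```python
from itertools import product
from math import exp


def utility(x):
    """OR game: every player gets 0 at the all-zero profile and -1 otherwise."""
    return 0.0 if sum(x) == 0 else -1.0


def transition(x, beta):
    """Return {y: P(x, y)} for one step of the logit dynamics from profile x."""
    n = len(x)
    row = {}
    for i in range(n):  # player i is selected with probability 1/n
        cand = [x[:i] + (s,) + x[i + 1:] for s in (0, 1)]
        w = [exp(beta * utility(y)) for y in cand]
        for y, wy in zip(cand, w):
            row[y] = row.get(y, 0.0) + wy / (n * (w[0] + w[1]))
    return row


def bottleneck_ratio(S, n, beta):
    """Q(S, complement of S) / pi(S), computed by enumeration."""
    S = set(S)
    z = sum(exp(beta * utility(x)) for x in product((0, 1), repeat=n))
    pi = lambda x: exp(beta * utility(x)) / z
    flow = sum(pi(x) * p for x in S
               for y, p in transition(x, beta).items() if y not in S)
    return flow / sum(pi(x) for x in S)
```

For $S = \{\mathbf{0}\}$ the result should match $1/(1 + e^{\beta})$, and for $R = \{0,1\}^n \setminus \{\mathbf{0}\}$ it should match $1/((2^n - 1)(1 + e^{-\beta}))$.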



In the next theorem we give upper bounds on the mixing time depending on the value of $\beta$. The theorem
shows that, if $\beta \leq c \log n$ for some constant $c$, the mixing time is effectively polynomial in $n$ with degree
depending on $c$. The use of the path coupling technique in the proof of the theorem requires a careful
choice of the edge-weights.

Theorem 12 (Upper bound on mixing time) The mixing time of the logit dynamics for the OR
game is $O(n^{5/2} 2^n)$ for every $\beta$. Moreover, for small values of $\beta$ the mixing time is

1. $O(n \log n)$ if $\beta < (1 - \varepsilon) \log n$, for an arbitrary small constant $\varepsilon > 0$;

2. $O(n^{c+3} \log n)$ if $\beta \leq c \log n$, where $c \geq 1$ is an arbitrary constant.

Proof. We apply the path coupling technique (see Theorem 2) with the Hamming graph
defined in (5). Let $\mathbf{x}, \mathbf{y} \in \{0, 1\}^n$ be two profiles differing only at player $j \in [n]$ and, without loss of
generality, let us assume $|\mathbf{x}| = k - 1$ and $|\mathbf{y}| = k$ for some $k = 1, \ldots, n$. We set the weight of edge $\{\mathbf{x}, \mathbf{y}\}$
depending only on $k$, i.e. $\ell(\mathbf{x}, \mathbf{y}) = \delta_k$ where $\delta_k \geq 1$ will be chosen later. Consider the coupling defined
in Subsection 3.2.

Now we evaluate the expected distance after one step $\mathbf{E}_{\mathbf{x},\mathbf{y}}[\rho(X_1, Y_1)]$ of the two coupled chains
$(X_t, Y_t)$ starting at $(\mathbf{x}, \mathbf{y})$. Let $i$ be the player chosen for the update. Observe that if $i = j$, i.e. if we
update the player where $\mathbf{x}$ and $\mathbf{y}$ are different (this holds with probability $1/n$), then the distance after
one step is zero; otherwise we distinguish four cases depending on the value of $k$.

Case $k = 1$: In this case profile $\mathbf{x}$ is all zeros and profile $\mathbf{y}$ has only one 1 and the length of edge $\{\mathbf{x}, \mathbf{y}\}$
is $\ell(\mathbf{x}, \mathbf{y}) = \delta_1$. When choosing a player $i \neq j$ (this happens with probability $(n - 1)/n$), at the next step
the two chains will be at distance $\delta_1$ (if in both chains player $i$ chooses strategy 0, and this holds with
probability $\min\{\sigma_i(0 \mid \mathbf{x}), \sigma_i(0 \mid \mathbf{y})\}$), or at distance $\delta_2$ (if in both chains player $i$ chooses strategy 1, and
this holds with probability $\min\{\sigma_i(1 \mid \mathbf{x}), \sigma_i(1 \mid \mathbf{y})\}$), or at distance $\delta_1 + \delta_2$ (if player $i$ chooses strategy 0
in chain $X_1$ and strategy 1 in chain $Y_1$, and this holds with the remaining probability). Notice that,
from the definition of the coupling, it will never happen that player $i$ chooses strategy 1 in chain $X_1$ and
strategy 0 in chain $Y_1$; indeed we have that

$$\min\{\sigma_i(0 \mid \mathbf{x}), \sigma_i(0 \mid \mathbf{y})\} = \sigma_i(0 \mid \mathbf{y}) = \frac{1}{2} \qquad \text{and} \qquad \min\{\sigma_i(1 \mid \mathbf{x}), \sigma_i(1 \mid \mathbf{y})\} = \sigma_i(1 \mid \mathbf{x}) = \frac{1}{1 + e^{\beta}}. \qquad (10)$$

Hence the expected distance after one step is

$$\mathbf{E}_{\mathbf{x},\mathbf{y}}[\rho(X_1, Y_1)] = \frac{n - 1}{n} \left( \frac{\delta_1}{1 + e^{-\beta}} + \frac{\delta_2}{2} \right). \qquad (11)$$


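Case $k = 2$: In this case we have $x_j = 0$ and $y_j = 1$, there is another player $h \in [n] \setminus \{j\}$ where
$x_h = y_h = 1$, and for all the other players $i \in [n] \setminus \{j, h\}$ it holds $x_i = y_i = 0$. Hence the length of edge
$\{\mathbf{x}, \mathbf{y}\}$ is $\ell(\mathbf{x}, \mathbf{y}) = \delta_2$.

When player $h$ is chosen (this holds with probability $1/n$) we have that $\sigma_h(s \mid \mathbf{x})$ and $\sigma_h(s \mid \mathbf{y})$ for
$s = 0, 1$ are the same as in (10). At the next step the two chains will be at distance $\delta_2$ (if player $h$ stays
at strategy 1 in both chains), or at distance $\delta_1$ (if player $h$ chooses strategy 0 in both chains), or at
distance $\delta_1 + \delta_2$ (if player $h$ chooses strategy 0 in chain $X_1$ and stays at strategy 1 in chain $Y_1$).

When a player $i \notin \{h, j\}$ is chosen (this holds with probability $(n - 2)/n$) we have that $\sigma_i(0 \mid \mathbf{x}) =
\sigma_i(1 \mid \mathbf{x}) = \sigma_i(0 \mid \mathbf{y}) = \sigma_i(1 \mid \mathbf{y}) = 1/2$. Thus in this case the two coupled chains always perform the same
choice at player $i$, and at the next step they will be at distance $\delta_2$ (if player $i$ stays at strategy 0 in both
chains) or at distance $\delta_3$ (if player $i$ chooses strategy 1 in both chains).
Hence the expected distance after one step is

$$\mathbf{E}_{\mathbf{x},\mathbf{y}}[\rho(X_1, Y_1)] = \frac{1}{n} \left( \frac{1}{2}\delta_1 + \frac{1}{1 + e^{\beta}}\delta_2 + \left(1 - \frac{1}{2} - \frac{1}{1 + e^{\beta}}\right)(\delta_1 + \delta_2) \right) + \frac{n - 2}{n} \left( \frac{1}{2}\delta_2 + \frac{1}{2}\delta_3 \right)$$

$$= \frac{1}{2n} \left( \frac{2}{1 + e^{-\beta}}\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3 \right). \qquad (12)$$
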

Case $3 \leq k \leq n - 1$: When a player $i \neq j$ is chosen such that $x_i = y_i = 1$ (this holds with probability
$(k - 1)/n$) then at the next step the two chains will be at distance $\delta_k$ (if $i$ stays at strategy 1) or at
distance $\delta_{k-1}$ (if $i$ moves to strategy 0). When a player $i \neq j$ is chosen such that $x_i = y_i = 0$ (this holds
with probability $(n - k)/n$) then at the next step the two chains will be at distance $\delta_k$ (if $i$ chooses to
stay at strategy 0) or at distance $\delta_{k+1}$ (if $i$ chooses to move to strategy 1). Hence the expected distance
after one step is

$$\mathbf{E}_{\mathbf{x},\mathbf{y}}[\rho(X_1, Y_1)] = \frac{k - 1}{n} \left( \frac{1}{2}\delta_k + \frac{1}{2}\delta_{k-1} \right) + \frac{n - k}{n} \left( \frac{1}{2}\delta_k + \frac{1}{2}\delta_{k+1} \right)$$

$$= \frac{1}{2n} \left( (n - 1)\delta_k + (k - 1)\delta_{k-1} + (n - k)\delta_{k+1} \right). \qquad (13)$$

Case $k = n$: When a player $i \neq j$ is chosen, then at the next step the two chains will be at distance $\delta_n$
or at distance $\delta_{n-1}$. Hence the expected distance after one step is

$$\mathbf{E}_{\mathbf{x},\mathbf{y}}[\rho(X_1, Y_1)] = \frac{n - 1}{n} \left( \frac{1}{2}\delta_n + \frac{1}{2}\delta_{n-1} \right) = \frac{n - 1}{2n} (\delta_n + \delta_{n-1}). \qquad (14)$$

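In order to apply Theorem 2 we now have to show that it is possible to choose the edge weights $\delta_1, \ldots, \delta_n$
and a parameter $\alpha > 0$ such that

$$\begin{aligned}
\frac{n - 1}{n} \left( \frac{\delta_1}{1 + e^{-\beta}} + \frac{\delta_2}{2} \right) &\leq \delta_1 e^{-\alpha},\\
\frac{1}{2n} \left( \frac{2}{1 + e^{-\beta}}\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3 \right) &\leq \delta_2 e^{-\alpha},\\
\frac{1}{2n} \left( (n - 1)\delta_k + (k - 1)\delta_{k-1} + (n - k)\delta_{k+1} \right) &\leq \delta_k e^{-\alpha}, \quad \text{for } k = 3, \ldots, n - 1,\\
\frac{n - 1}{2n} (\delta_n + \delta_{n-1}) &\leq \delta_n e^{-\alpha}.
\end{aligned} \qquad (15)$$
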

For different values of $\beta$, we make different choices for $\alpha$ and for the weights $\delta_k$. For clarity's sake we
split the proof in three different lemmas. We denote by $\delta_{\max}$ the largest $\delta_k$.

In Lemma 13 we show that Inequalities (15) are satisfied for every value of $\beta$ by choosing the weights
as follows

$$\delta_k = \begin{cases} \frac{1}{2}\left[(n - 1)\delta_2 + 1\right], & \text{if } k = 1;\\ \frac{n - k}{k}\delta_{k+1} + 1, & \text{if } 2 \leq k \leq n - 1;\\ 1, & \text{if } k = n; \end{cases}$$

and by setting $\alpha = 1/(2n\delta_{\max})$. From Corollary 18 we have $\delta_{\max} = O(\sqrt{n}\, 2^n)$. Observe that the diameter
of the Hamming graph is $\sum_{i=1}^{n} \delta_i \leq n\delta_{\max}$, hence from Theorem 2 we obtain $t_{\mathrm{mix}} = O(n^{5/2} 2^n)$.

In Lemma 14 we show that, if $\beta < (1 - \varepsilon) \log n$ for an arbitrarily small constant $\varepsilon > 0$, Inequalities (15)
are satisfied, for sufficiently large $n$, by choosing weights $\delta_1 = n^{1-\varepsilon}$, $\delta_2 = 4/3$, $\delta_3 = \cdots = \delta_n = 1$, and
$\alpha = 1/n$. In this case the diameter is $O(n)$ and, by Theorem 2, $t_{\mathrm{mix}} = O(n \log n)$.

In Lemma 15 we show that Inequalities (15) are satisfied by choosing weights as follows

$$\delta_k = \begin{cases} \frac{1 + e^{-\beta}}{2}\left[\frac{a_1}{b_1}\delta_2 + 1\right], & \text{if } k = 1;\\ \frac{a_k}{b_k}\delta_{k+1} + 1, & \text{if } 2 \leq k \leq n - 1;\\ 1, & \text{if } k = n; \end{cases}$$

where $a_1 = n - 1$ and $b_1 = ne^{-\beta} + 1$ and, for every $k = 2, \ldots, n - 1$,

$$a_k = (n - k) b_{k-1}, \qquad b_k = (n + 1) b_{k-1} - (k - 1) a_{k-1};$$

and by setting $\alpha = 1/(2n\delta_{\max})$. From Corollary 21 it follows that, if $\beta \leq c \log n$ for a constant $c \in \mathbb{N}$, we
have that $\delta_{\max} = O(n^{c+2})$ and the diameter of the Hamming graph is $O(n^{c+3})$. Thus, by Theorem 2 it
follows that $t_{\mathrm{mix}} = O(n^{c+3} \log n)$. □

6.1 Technical lemmas 

In this section we prove the technical lemmas needed for completing the proof of Theorem 12.

Lemma 13 Let $\delta_1, \ldots, \delta_n$ be as follows

$$\delta_k = \begin{cases} \frac{1}{2}\left[(n - 1)\delta_2 + 1\right], & \text{if } k = 1;\\ \frac{n - k}{k}\delta_{k+1} + 1, & \text{if } 2 \leq k \leq n - 1;\\ 1, & \text{if } k = n; \end{cases} \qquad (16)$$

and let $\alpha = 1/(2n\delta_{\max})$ where $\delta_{\max} = \max\{\delta_k : k = 1, \ldots, n\}$. Then Inequalities (15) are satisfied for
every $\beta \geq 0$.

Proof. Observe that, for every $k = 1, \ldots, n$, the right-hand side of the $k$-th inequality in (15) is

$$\delta_k e^{-\alpha} = \delta_k e^{-1/(2n\delta_{\max})} \geq \delta_k \left(1 - \frac{1}{2n\delta_{\max}}\right) = \delta_k - \frac{\delta_k}{2n\delta_{\max}} \geq \delta_k - \frac{1}{2n}. \qquad (17)$$

Now we check that the left-hand side is at most $\delta_k - 1/(2n)$.
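First inequality ($k = 1$): $\frac{n-1}{n}\left(\frac{\delta_1}{1 + e^{-\beta}} + \frac{\delta_2}{2}\right) \leq \delta_1 e^{-\alpha}$.
From the definition of $\delta_1$ in (16) we have that

$$\delta_2 = \frac{2\delta_1 - 1}{n - 1}.$$

Hence the left-hand side is

$$\frac{n - 1}{n}\left(\frac{\delta_1}{1 + e^{-\beta}} + \frac{\delta_2}{2}\right) \leq \frac{n - 1}{n}\left(\delta_1 + \frac{\delta_2}{2}\right) = \frac{n - 1}{n}\left(\delta_1 + \frac{2\delta_1 - 1}{2(n - 1)}\right) = \frac{1}{2n}(2n\delta_1 - 1) = \delta_1 - \frac{1}{2n}.$$

Second inequality ($k = 2$): $\frac{1}{2n}\left(\frac{2}{1 + e^{-\beta}}\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3\right) \leq \delta_2 e^{-\alpha}$.
From the definition of $\delta_2$ in (16) we have that

$$\delta_3 = \frac{2}{n - 2}(\delta_2 - 1).$$

Hence the left-hand side of the second inequality is

$$\frac{1}{2n}\left(\frac{2}{1 + e^{-\beta}}\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3\right) \leq \frac{1}{2n}\left(2\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3\right)$$

$$= \frac{1}{2n}\left((n - 1)\delta_2 + 1 + (n - 1)\delta_2 + 2(\delta_2 - 1)\right) = \frac{1}{2n}(2n\delta_2 - 1) = \delta_2 - \frac{1}{2n}.$$

Other inequalities ($k = 3, \ldots, n - 1$): $\frac{1}{2n}\left((n - 1)\delta_k + (k - 1)\delta_{k-1} + (n - k)\delta_{k+1}\right) \leq \delta_k e^{-\alpha}$.
From the definition of $\delta_k$ in (16) we have that

$$\delta_{k+1} = \frac{k}{n - k}(\delta_k - 1).$$

Hence the left-hand side is

$$\frac{1}{2n}\left((n - 1)\delta_k + (k - 1)\delta_{k-1} + (n - k)\delta_{k+1}\right) = \frac{1}{2n}\left((n - 1)\delta_k + (n - k + 1)\delta_k + (k - 1) + k\delta_k - k\right) = \frac{1}{2n}(2n\delta_k - 1) = \delta_k - \frac{1}{2n}.$$

Last inequality ($k = n$): $\frac{n-1}{2n}(\delta_n + \delta_{n-1}) \leq \delta_n e^{-\alpha}$.

Since $\delta_n = 1$ and $\delta_{n-1} = \frac{1}{n-1}\delta_n + 1 = \frac{n}{n-1}$, the left-hand side of the last inequality is

$$\frac{n - 1}{2n}(\delta_n + \delta_{n-1}) = \frac{n - 1}{2n}\left(1 + \frac{n}{n - 1}\right) = 1 - \frac{1}{2n}.$$

□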
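The weights of Lemma 13 and Inequalities (15) are easy to check numerically for moderate $n$. The following Python sketch is an illustration under our reading of the formulas: weights are computed backwards from $\delta_n = 1$, $\alpha = 1/(2n\delta_{\max})$, and the four left-hand sides are those of (15). It assumes $n \geq 3$; the function names are ours.

```python
from math import exp


def lemma13_weights(n):
    """Return a list d with d[k] = delta_k for k = 1..n (d[0] is unused)."""
    d = [0.0] * (n + 1)
    d[n] = 1.0
    for k in range(n - 1, 1, -1):          # k = n-1, ..., 2
        d[k] = (n - k) / k * d[k + 1] + 1.0
    d[1] = 0.5 * ((n - 1) * d[2] + 1.0)
    return d


def lhs(d, n, beta, k):
    """Left-hand side of the k-th inequality in (15)."""
    s = 1.0 / (1.0 + exp(-beta))
    if k == 1:
        return (n - 1) / n * (s * d[1] + d[2] / 2)
    if k == 2:
        return (2 * s * d[1] + (n - 1) * d[2] + (n - 2) * d[3]) / (2 * n)
    if k == n:
        return (n - 1) / (2 * n) * (d[n] + d[n - 1])
    return ((n - 1) * d[k] + (k - 1) * d[k - 1] + (n - k) * d[k + 1]) / (2 * n)


def inequalities_hold(n, beta):
    """Check all n inequalities of (15) with alpha = 1 / (2 n delta_max)."""
    d = lemma13_weights(n)
    alpha = 1.0 / (2 * n * max(d[1:]))
    return all(lhs(d, n, beta, k) <= d[k] * exp(-alpha) for k in range(1, n + 1))
```

For example, `lemma13_weights(3)` gives $\delta_1 = 2$, $\delta_2 = 1.5$, $\delta_3 = 1$, and `inequalities_hold(8, beta)` should return `True` for any $\beta \geq 0$. Floating-point margins shrink as $\delta_{\max}$ grows, so very large $n$ would call for exact rational arithmetic.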

2n 2n n—1 2n 



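Lemma 14 Let $\delta_1, \ldots, \delta_n$ be as follows

$$\delta_1 = n^{1-\varepsilon}, \qquad \delta_2 = 4/3, \qquad \delta_3 = \cdots = \delta_n = 1,$$

where $\varepsilon > 0$ is an arbitrary small constant and let $\alpha = 1/n$. Then Inequalities (15) are satisfied for
$\beta \leq (1 - \varepsilon) \log n$ and $n$ sufficiently large.

Proof. We check that all the inequalities in (15) are satisfied.

First inequality ($k = 1$): $\frac{n-1}{n}\left(\frac{\delta_1}{1 + e^{-\beta}} + \frac{\delta_2}{2}\right) \leq \delta_1 e^{-\alpha}$.

For the left-hand side we have

$$\frac{n - 1}{n}\left(\frac{\delta_1}{1 + e^{-\beta}} + \frac{\delta_2}{2}\right) = \left(1 - \frac{1}{n}\right)\left(\frac{n^{1-\varepsilon}}{1 + e^{-\beta}} + \frac{2}{3}\right) \leq \left(1 - \frac{1}{n}\right)\left(\frac{(n^{1-\varepsilon} + 1)(n^{1-\varepsilon} - 1) + 1}{n^{1-\varepsilon} + 1} + \frac{2}{3}\right)$$

$$= \left(1 - \frac{1}{n}\right)\left(n^{1-\varepsilon} + \frac{1}{n^{1-\varepsilon} + 1} - \frac{1}{3}\right).$$

For the right-hand side we have

$$\delta_1 e^{-\alpha} = n^{1-\varepsilon} e^{-1/n} \geq n^{1-\varepsilon}\left(1 - \frac{1}{n}\right).$$

Hence the left-hand side is smaller than the right-hand one (for $n$ sufficiently large).

Second inequality ($k = 2$): $\frac{1}{2n}\left(\frac{2}{1 + e^{-\beta}}\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3\right) \leq \delta_2 e^{-\alpha}$.

For the left-hand side we have

$$\frac{1}{2n}\left(\frac{2}{1 + e^{-\beta}}\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3\right) = \frac{1}{2n}\left(\frac{2}{1 + e^{-\beta}} n^{1-\varepsilon} + (n - 1)\frac{4}{3} + (n - 2)\right) \leq \frac{7}{6} + \frac{1}{n^{\varepsilon}}.$$

And for the right-hand side we have

$$\delta_2 e^{-\alpha} = \frac{4}{3} e^{-1/n} \geq \frac{4}{3}\left(1 - \frac{1}{n}\right).$$

Hence the left-hand side is smaller than the right-hand one (for $n$ sufficiently large).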
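Third inequality ($k = 3$): $\frac{1}{2n}\left((n - 1)\delta_3 + 2\delta_2 + (n - 3)\delta_4\right) \leq \delta_3 e^{-\alpha}$.

For the left-hand side we have

$$\frac{1}{2n}\left((n - 1)\delta_3 + 2\delta_2 + (n - 3)\delta_4\right) = \frac{1}{2n}\left((n - 1) + 2 \cdot \frac{4}{3} + (n - 3)\right) = \frac{1}{2n}\left(2n - \frac{4}{3}\right) = 1 - \frac{2}{3n}.$$

And for the right-hand side we have

$$\delta_3 e^{-\alpha} = e^{-1/n} \geq 1 - \frac{1}{n}.$$

Hence the left-hand side is smaller than the right-hand one.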

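Other inequalities ($k \geq 4$): $\frac{1}{2n}\left((n - 1)\delta_k + (k - 1)\delta_{k-1} + (n - k)\delta_{k+1}\right) \leq \delta_k e^{-\alpha}$.

Since $\delta_k = \delta_{k-1} = \delta_{k+1} = 1$ the left-hand side is equal to $1 - \frac{1}{n}$ and the right-hand side is $e^{-1/n} \geq 1 - \frac{1}{n}$. □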
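Lemma 15 Let $\delta_1, \ldots, \delta_n$ be as follows

$$\delta_k = \begin{cases} \frac{1 + e^{-\beta}}{2}\left[\frac{a_1}{b_1}\delta_2 + 1\right], & \text{if } k = 1;\\ \frac{a_k}{b_k}\delta_{k+1} + 1, & \text{if } 2 \leq k \leq n - 1;\\ 1, & \text{if } k = n; \end{cases} \qquad (18)$$

where $a_1 = n - 1$ and $b_1 = ne^{-\beta} + 1$ and for every $k = 2, \ldots, n - 1$

$$a_k = (n - k) b_{k-1} \qquad \text{and} \qquad b_k = (n + 1) b_{k-1} - (k - 1) a_{k-1},$$

and let $\alpha = 1/(2n\delta_{\max})$ where $\delta_{\max} = \max\{\delta_k : k = 1, \ldots, n\}$. Then Inequalities (15) are satisfied for
every $\beta \geq 0$.

Before proving Lemma 15 we make the following observation.

Observation 16 Let $b_k$ be defined as in Lemma 15. Then, for every $k \geq 2$, it holds that $b_k \geq k b_{k-1}$.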

Proof. We proceed by induction on $k$. The base case $k = 2$ follows from

$$b_2 = (n + 1)(ne^{-\beta} + 1) - (n - 1) = (n + 1)ne^{-\beta} + 2 \geq 2(ne^{-\beta} + 1) = 2b_1.$$

Now suppose the claim holds for $k - 1$, that is $b_{k-1} \geq (k - 1) b_{k-2}$. Then

$$b_k = (n + 1) b_{k-1} - (k - 1) a_{k-1} = (n + 1) b_{k-1} - (k - 1)(n - k + 1) b_{k-2} \geq \left[(n + 1) - (n - k + 1)\right] b_{k-1} = k b_{k-1}.$$

□

Proof (Lemma 15). Observe that, as in Equation (17), for every $k = 1, \ldots, n$, the right-hand side of the
$k$-th inequality in (15) is

$$\delta_k e^{-\alpha} \geq \delta_k - \frac{1}{2n}.$$

Now we check that the left-hand side is at most $\delta_k - 1/(2n)$.

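First inequality ($k = 1$): $\frac{n-1}{n}\left(\frac{\delta_1}{1 + e^{-\beta}} + \frac{\delta_2}{2}\right) \leq \delta_1 e^{-\alpha}$.

From the definition of $\delta_1$ in (18) we have that

$$\delta_2 = \frac{ne^{-\beta} + 1}{n - 1}\left(\frac{2\delta_1}{1 + e^{-\beta}} - 1\right).$$

Hence the left-hand side is

$$\frac{n - 1}{n}\left(\frac{\delta_1}{1 + e^{-\beta}} + \frac{\delta_2}{2}\right) = \frac{n - 1}{n}\left(\frac{\delta_1}{1 + e^{-\beta}} + \frac{ne^{-\beta} + 1}{2(n - 1)}\left(\frac{2\delta_1}{1 + e^{-\beta}} - 1\right)\right) = \delta_1 - \frac{ne^{-\beta} + 1}{2n} < \delta_1 - \frac{1}{2n}.$$
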

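Second inequality ($k = 2$): $\frac{1}{2n}\left(\frac{2}{1 + e^{-\beta}}\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3\right) \leq \delta_2 e^{-\alpha}$.
From the definition of $\delta_2$ in (18) we have that

$$\delta_3 = \frac{b_2}{a_2}(\delta_2 - 1) = \frac{(n + 1) b_1 - a_1}{(n - 2) b_1}(\delta_2 - 1).$$

Hence the left-hand side is

$$\frac{1}{2n}\left(\frac{2}{1 + e^{-\beta}}\delta_1 + (n - 1)\delta_2 + (n - 2)\delta_3\right) = \frac{1}{2n}\left(\frac{a_1}{b_1}\delta_2 + 1 + (n - 1)\delta_2 + \frac{(n + 1) b_1 - a_1}{b_1}(\delta_2 - 1)\right)$$

$$= \delta_2 - \frac{1}{2n} \cdot \frac{n b_1 - a_1}{b_1} = \delta_2 - \frac{1}{2n}\left(n - \frac{n - 1}{ne^{-\beta} + 1}\right) < \delta_2 - \frac{1}{2n}.$$
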

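Other inequalities ($k = 3, \ldots, n - 1$): $\frac{1}{2n}\left((n - 1)\delta_k + (k - 1)\delta_{k-1} + (n - k)\delta_{k+1}\right) \leq \delta_k e^{-\alpha}$.
From the definition of $\delta_k$ in (18) we have that

$$\delta_{k+1} = \frac{b_k}{a_k}(\delta_k - 1) = \frac{b_k}{(n - k) b_{k-1}}(\delta_k - 1).$$

Hence the left-hand side is

$$\frac{1}{2n}\left((n - 1)\delta_k + (k - 1)\delta_{k-1} + (n - k)\delta_{k+1}\right) = \frac{1}{2n}\left((n - 1)\delta_k + (k - 1)\left(\frac{a_{k-1}}{b_{k-1}}\delta_k + 1\right) + \frac{(n + 1) b_{k-1} - (k - 1) a_{k-1}}{b_{k-1}}(\delta_k - 1)\right)$$

$$= \delta_k - \frac{1}{2n} \cdot \frac{(n - k + 2) b_{k-1} - (k - 1) a_{k-1}}{b_{k-1}} = \delta_k - \frac{1}{2n}\left((n - k + 2) - (k - 1)(n - k + 1)\frac{b_{k-2}}{b_{k-1}}\right)$$

$$\leq \delta_k - \frac{1}{2n}\left((n - k + 2) - (n - k + 1)\right) = \delta_k - \frac{1}{2n},$$
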

where the inequality follows from Observation 16.

Last inequality ($k = n$): $\frac{n-1}{2n}(\delta_n + \delta_{n-1}) \leq \delta_n e^{-\alpha}$.

Since $\delta_n = 1$ and $\delta_{n-1} = \frac{a_{n-1}}{b_{n-1}}\delta_n + 1 = \frac{a_{n-1}}{b_{n-1}} + 1$, the left-hand side of the last inequality is

$$\frac{n - 1}{2n}(\delta_n + \delta_{n-1}) = \frac{n - 1}{2n}\left(2 + \frac{a_{n-1}}{b_{n-1}}\right) = \frac{n - 1}{2n}\left(2 + \frac{b_{n-2}}{b_{n-1}}\right) \leq \frac{n - 1}{2n}\left(2 + \frac{1}{n - 1}\right) = 1 - \frac{1}{2n},$$

where the inequality follows from Observation 16. □

In order to apply the path coupling theorem, we need to bound $\delta_{\max}$: the next observation will
represent the main tool to achieve this goal.

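Observation 17 Let $\delta_1, \ldots, \delta_n$ be defined recursively as follows: $\delta_n = 1$ and

$$\delta_k = \gamma_k \delta_{k+1} + 1,$$

where $\gamma_k > 0$ for every $k = 1, \ldots, n - 1$. Let $\delta_{\max} = \max\{\delta_k : k = 1, \ldots, n\}$. Then

$$\delta_{\max} \leq n \max\left\{ \prod_{i=h}^{j} \gamma_i \ :\ 1 \leq h \leq j \leq n - 1 \right\}.$$

Proof. The observation follows from the fact that, for $k = 1, \ldots, n - 1$, we have

$$\delta_k = 1 + \sum_{j=k}^{n-1} \prod_{i=k}^{j} \gamma_i.$$

□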
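The unrolled form of the recursion $\delta_k = \gamma_k \delta_{k+1} + 1$ used in this proof can be checked with a few lines of Python. The sketch below is only illustrative: the helper names are ours, `gamma[k - 1]` stores $\gamma_k$, and the sample coefficients $\gamma_k = (n - k)/k$ mirror the recursion of Lemma 13 for $k \geq 2$ (our choice for the example).

```python
from math import prod


def weights(gamma):
    """gamma[k-1] holds gamma_k for k = 1..n-1; returns [delta_1, ..., delta_n]."""
    n = len(gamma) + 1
    delta = [1.0] * n                      # delta_n = 1
    for k in range(n - 1, 0, -1):          # k = n-1, ..., 1
        delta[k - 1] = gamma[k - 1] * delta[k] + 1.0
    return delta


def unrolled(gamma, k):
    """1 + sum over j = k..n-1 of the product gamma_k * ... * gamma_j."""
    n = len(gamma) + 1
    return 1.0 + sum(prod(gamma[k - 1:j]) for j in range(k, n))


def max_window_product(gamma):
    """Largest product of consecutive coefficients gamma_h * ... * gamma_j."""
    m = len(gamma)
    return max(prod(gamma[h:j]) for h in range(m) for j in range(h + 1, m + 1))
```

On the sample coefficients, where the largest window product is at least 1, `max(weights(gamma))` stays below `n * max_window_product(gamma)`, in line with the stated bound.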

Corollary 18 Let $\delta_1, \ldots, \delta_n$ be defined as in Lemma 13. Then $\delta_{\max} \leq c\sqrt{n}\, 2^n$ for a suitable constant $c$.

Proof. From Observation 17 and the definition of $\delta_1, \ldots, \delta_n$, it holds that

$$\delta_{\max} \leq n \max\left\{ \prod_{i=h}^{j} \frac{n - i}{i} \ :\ 1 \leq h \leq j \leq n - 1 \right\} = n \prod_{i=1}^{\lfloor n/2 \rfloor} \frac{n - i}{i} = n \binom{n - 1}{\lfloor n/2 \rfloor} \leq c\sqrt{n}\, 2^n$$

for a suitable constant $c$. □

In order to bound $\delta_{\max}$ when $\delta_1, \ldots, \delta_n$ are defined as in Lemma 15 and $\beta \leq c \log n$ for a constant
$c \in \mathbb{N}$, we define

$$\gamma_k = \frac{a_k}{b_k} = \frac{p_k e^{-\beta} + l_k}{q_k e^{-\beta} + r_k}. \qquad (19)$$

You can check that $p_1 = 0$, $q_1 = n$ and

$$p_k = (n - k) q_{k-1}, \qquad q_k = (n + 1) q_{k-1} - (k - 1) p_{k-1};$$

we notice that $p_k = (n + 1) q_{k-1} - (k + 1) q_{k-1} \leq q_k$ for every $k$. We can also prove the following simple
observation about $q_k$.

Observation 19 For every constant $k \geq 1$, we have $q_k \geq 2^{-k} n^k$.

Proof. We proceed by induction on $k$, with the base $k = 1$ being obvious. Suppose the claim holds for
$k - 1$, that is $q_{k-1} \geq 2^{-(k-1)} n^{k-1}$, then

$$q_k = (n + 1) q_{k-1} - (k - 1) p_{k-1} \geq \frac{n}{2} q_{k-1} \geq 2^{-k} n^k.$$

□

Moreover, you can check that $l_1 = n - 1$, $r_1 = 1$ and

$$l_k = (n - k) r_{k-1}, \qquad r_k = (n + 1) r_{k-1} - (k - 1) l_{k-1};$$

we notice that the above recursion gives $l_k = (n - k)(k - 1)!$ and $r_k = k!$. The next lemma bounds $\gamma_k$ defined in
Equation (19).

Lemma 20 Let $\delta_1, \ldots, \delta_n$ be defined as in Lemma 15, $\gamma_k$ defined as in Equation (19) and $\beta \leq c \log n$
for a constant $c \in \mathbb{N}$. Then, for sufficiently large $n$, it holds that

$$\gamma_k < n \quad \forall k; \qquad \gamma_k < 1 \quad \text{if } k > c + 2; \qquad \gamma_{c+2} = O(1).$$


Proof. Since $p_k \leq q_k$ then $(n q_k - p_k) e^{-\beta} > 0$; instead, $l_k - n r_k = (k - 1)!(n - k - nk) < 0$. Hence we
have for every $k$

$$\gamma_k - n = \frac{p_k e^{-\beta} + l_k}{q_k e^{-\beta} + r_k} - n = \frac{(l_k - n r_k) - (n q_k - p_k) e^{-\beta}}{q_k e^{-\beta} + r_k} < 0.$$

Inductively, we show that for every $k \geq c + 3$, we have $\gamma_k < 1$. Set $k = c + 3$: $c$ is a constant, thus
Observation 19 holds for $k - 1$; hence, and since $e^{-\beta} \geq n^{-c}$, we have that

$$(q_{c+3} - p_{c+3}) e^{-\beta} = \left[(n + 1) q_{c+2} - (c + 2) p_{c+2} - (n - c - 3) q_{c+2}\right] e^{-\beta} \geq 2 q_{c+2} e^{-\beta} \geq 2^{-(c+1)} n^2.$$

Instead, $l_{c+3} - r_{c+3} = (c + 2)!(n - 2c - 6) < (c + 2)! \cdot n$. Thus,

$$\gamma_{c+3} - 1 = \frac{(l_{c+3} - r_{c+3}) - (q_{c+3} - p_{c+3}) e^{-\beta}}{q_{c+3} e^{-\beta} + r_{c+3}} \leq \frac{(c + 2)! \cdot n - 2^{-(c+1)} n^2}{q_{c+3} e^{-\beta} + r_{c+3}} < 0,$$

for $n$ sufficiently large. Now, suppose that $\gamma_{k-1} < 1$; then, we have

$$\gamma_k - 1 = \frac{a_k - b_k}{b_k} = \frac{(k - 1) a_{k-1} - (k + 1) b_{k-1}}{b_k} < 0,$$

where $a_{k-1} < b_{k-1}$ is implied by the inductive hypothesis.

In order to complete the proof, we need to show that $\gamma_{c+2} = O(1)$. Similarly to the case $k = c + 3$,
we obtain $(q_{c+2} - p_{c+2}) e^{-\beta} \geq 2^{-c} n$ and $l_{c+2} - r_{c+2} < (c + 1)! \cdot n$. Hence,

$$\gamma_{c+2} \leq \frac{p_{c+2} e^{-\beta} + r_{c+2} + (c + 1)! \cdot n}{p_{c+2} e^{-\beta} + r_{c+2} + 2^{-c} n} \leq (c + 1)! \cdot 2^{c} = O(1).$$

□

Corollary 21 Let $\delta_1, \ldots, \delta_n$ and $c$ be defined as in Lemma 15. Then $\delta_{\max} = O(n^{c+2})$.

Proof. From Observation 17, Lemma 20 and the definition of $\delta_1, \ldots, \delta_n$ it follows that

$$\delta_{\max} \leq n \max\left\{ \prod_{i=h}^{j} \frac{a_i}{b_i} \ :\ 1 \leq h \leq j \leq n - 1 \right\} = O(n^{c+2}).$$

□






7 The XOR game 



In this section we analyze the logit dynamics for another simple $n$-player game, the XOR game. The
XOR game is a symmetric $n$-player game in which each player has two strategies, denoted by 0 and 1,
and each player pays the XOR of the strategies of all players (including herself). More formally, for each
$i \in [n]$, the utility function $u_i(\cdot)$ is defined as follows:

$$u_i(\mathbf{x}) = \begin{cases} -1, & \text{if } \mathbf{x} \text{ has an odd number of 1's;} \\ 0, & \text{if } \mathbf{x} \text{ has an even number of 1's.} \end{cases}$$



Notice that the XOR game has $2^{n-1}$ Nash equilibria, namely all profiles with an even number of players
playing strategy 1. Nash equilibria have social welfare 0 and profiles not in equilibrium have social welfare
$-n$. Observe that the XOR game is a potential game with exact potential $\Phi$ where $\Phi(\mathbf{x}) = u_i(\mathbf{x})$ for
every $\mathbf{x}$ and every $i \in [n]$. Hence, the stationary distribution is

$$\pi(\mathbf{x}) = \begin{cases} e^{-\beta}/Z, & \text{if } \mathbf{x} \text{ has an odd number of 1's;} \\ 1/Z, & \text{if } \mathbf{x} \text{ has an even number of 1's;} \end{cases}$$

where the normalizing factor is $Z = 2^{n-1}(1 + e^{-\beta})$.

Even though this game looks similar to the OR game, it exhibits a different behavior. Theorem 22 gives
the stationary expected social welfare of the XOR game and we can see that, as $\beta$ increases, the expected
social welfare tends from below to the social welfare at the Nash equilibria. In contrast, the expected
social welfare of the OR game is better than the worst Nash equilibrium for all values of $\beta$. Moreover, in
Theorem 23 and Theorem 24 we show that the mixing time for the XOR game is polynomial in $n$ and
exponential in $\beta$, whereas the mixing time for the OR game can be bounded independently of $\beta$.

Theorem 22 (Expected social welfare) The stationary expected social welfare of the logit dynamics
for the XOR game is $\mathbb{E}_\pi[W] = -\frac{n}{1 + e^{\beta}}$.

Proof. The expected social welfare is

$$\mathbb{E}_\pi[W] = \sum_{\mathbf{x} \in \{0,1\}^n} W(\mathbf{x}) \pi(\mathbf{x}) = -n \cdot \frac{2^{n-1} e^{-\beta}}{2^{n-1}(1 + e^{-\beta})} = -\frac{n}{1 + e^{\beta}}.$$

□
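As a sanity check on the closed form above, the following minimal sketch enumerates all profiles for a small $n$ and compares the brute-force expectation with $-n/(1+e^{\beta})$. The function names are illustrative and not part of the paper; the weights assume the stationary distribution stated above, with weight $e^{-\beta}$ on odd profiles and $1$ on even ones.

```python
import itertools
import math


def xor_utility(profile):
    """Utility of every player: -1 if the number of 1's is odd, 0 otherwise."""
    return -1 if sum(profile) % 2 == 1 else 0


def stationary_welfare(n, beta):
    """Brute-force expected social welfare under pi(x) proportional to exp(beta * u(x))."""
    profiles = list(itertools.product((0, 1), repeat=n))
    weights = [math.exp(beta * xor_utility(x)) for x in profiles]
    z = sum(weights)  # normalizing factor Z
    # The social welfare W(x) is the sum of the n identical utilities.
    return sum(n * xor_utility(x) * w for x, w in zip(profiles, weights)) / z


def closed_form_welfare(n, beta):
    """Closed form from Theorem 22: -n / (1 + e^beta)."""
    return -n / (1.0 + math.exp(beta))
```

For instance, `stationary_welfare(4, 1.5)` and `closed_form_welfare(4, 1.5)` should agree up to floating-point error, and both tend to 0 (the welfare at the Nash equilibria) from below as `beta` grows.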

The next theorem shows that the mixing time is exponential in $\beta$ for every $\beta > 0$.

Theorem 23 (Lower bound on mixing time) The mixing time of the logit dynamics for the XOR
game is $\Omega(e^{\beta})$.

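Proof. Consider the set $S \subseteq \{0,1\}^n$ containing only the state $\mathbf{0} = (0, \ldots, 0)$. Observe that $\pi(\mathbf{0}) \leq 1/2$.
The bottleneck ratio is

$$B(\mathbf{0}) = \frac{1}{\pi(\mathbf{0})} \sum_{\mathbf{y} \neq \mathbf{0}} \pi(\mathbf{0}) P(\mathbf{0}, \mathbf{y}) = \sum_{\mathbf{y} \in \{0,1\}^n :\, |\mathbf{y}| = 1} P(\mathbf{0}, \mathbf{y}) = n \cdot \frac{1}{n} \cdot \frac{1}{1 + e^{\beta}} = \frac{1}{1 + e^{\beta}}.$$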

Hence, by applying Theorem 01 the mixing time is 

$$t_{\mathrm{mix}} = \Omega\left(\frac{1}{B(\mathbf{0})}\right) = \Omega(e^{\beta}).$$

□ 
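The bottleneck computation in this proof can be reproduced numerically for small $n$. The sketch below builds the one-step transition probabilities of the logit dynamics for the XOR game and sums the probability of leaving the all-zero profile; it assumes the standard logit update rule (a uniformly selected player picks strategy $s$ with probability proportional to $e^{\beta u_i(\mathbf{x}_{-i}, s)}$), and all function names are illustrative.

```python
import itertools
import math


def utility(profile):
    """XOR-game utility: -1 for an odd number of 1's, 0 otherwise."""
    return -1 if sum(profile) % 2 == 1 else 0


def weight(profile, beta):
    """Unnormalized stationary weight of a profile."""
    return math.exp(beta * utility(profile))


def transition(x, y, beta):
    """P(x, y) for one step of the logit dynamics (x, y are tuples of 0/1)."""
    n = len(x)
    diff = [i for i in range(n) if x[i] != y[i]]
    if len(diff) > 1:
        return 0.0  # only one player moves per step

    def sigma(i, s):
        # Probability that player i, once selected, picks strategy s.
        w = [weight(x[:i] + (a,) + x[i + 1:], beta) for a in (0, 1)]
        return w[s] / (w[0] + w[1])

    if len(diff) == 1:
        i = diff[0]
        return sigma(i, y[i]) / n
    # y == x: the selected player keeps her current strategy.
    return sum(sigma(i, x[i]) for i in range(n)) / n


def bottleneck_ratio_zero(n, beta):
    """B({0}) = sum over y != 0 of P(0, y); the factor pi(0) cancels."""
    zero = (0,) * n
    states = itertools.product((0, 1), repeat=n)
    return sum(transition(zero, y, beta) for y in states if y != zero)
```

Under these assumptions `bottleneck_ratio_zero(n, beta)` should match $1/(1+e^{\beta})$ for any $n$, which is what makes the lower bound independent of the number of players.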

Finally, in the next theorem we give an almost matching upper bound to the mixing time. 

Theorem 24 (Upper bound on mixing time) The mixing time of the logit dynamics for the XOR
game is $O(n^3 e^{\beta})$.

The theorem is proved using coupling (see Theorem 1) and the proof is presented in the next sections.
Specifically, we use the coupling described in Section 3.2; in Section 7.1 we show that if the coupled
chains are at even distance then the distance does not increase after one step of the coupling; in Section 7.2
we show that if the coupled chains are at odd distance then they get closer with probability
independent of $\beta$; finally, in Section 7.3 we bound the expected time needed by the two chains to
coalesce and use Theorem 1 to derive an upper bound for the mixing time.
7.1 Even Hamming distance

Let $X_t$ and $Y_t$ be two chains coupled as described in Section 3.2. Suppose that $X_t = \mathbf{x}$, $Y_t = \mathbf{y}$, and
$H(\mathbf{x}, \mathbf{y}) = 2\ell$, for $\ell > 0$. In this case, $u_i(\mathbf{x}) = u_i(\mathbf{y}) = b$ for all $i \in [n]$ and some $b \in \{-1, 0\}$.

Let $i$ be the index selected for update and let us distinguish two cases. In the first case $x_i = y_i$ and
we have

$$u_i(\mathbf{x}_{-i}, 0) = u_i(\mathbf{y}_{-i}, 0) \quad \text{and} \quad u_i(\mathbf{x}_{-i}, 1) = u_i(\mathbf{y}_{-i}, 1)$$

and thus

$$\sigma_i(0 \mid \mathbf{x}) = \sigma_i(0 \mid \mathbf{y}) \quad \text{and} \quad \sigma_i(1 \mid \mathbf{x}) = \sigma_i(1 \mid \mathbf{y}).$$

Therefore the coupling always updates the strategy of player $i$ in the same way in the two chains and
thus $H(X_{t+1}, Y_{t+1}) = 2\ell$.

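In the second case we have $x_i \neq y_i$ and we assume, without loss of generality, that $x_i = 0$ and $y_i = 1$.
We observe that, for $b \in \{-1, 0\}$,

$$u_i(\mathbf{x}_{-i}, 0) = u_i(\mathbf{y}_{-i}, 1) = b \quad \text{and} \quad u_i(\mathbf{y}_{-i}, 0) = u_i(\mathbf{x}_{-i}, 1) = -(1 + b).$$

Therefore we have

$$\sigma_i(0 \mid \mathbf{x}) = \sigma_i(1 \mid \mathbf{y}) = \frac{1}{1 + e^{-(1+2b)\beta}} \quad \text{and} \quad \sigma_i(1 \mid \mathbf{x}) = \sigma_i(0 \mid \mathbf{y}) = \frac{1}{1 + e^{(1+2b)\beta}}$$

and thus we have three possible updates for the strategy of player $i$:

1. both chains update to 0 (and thus $H(X_{t+1}, Y_{t+1}) = 2\ell - 1$) with probability

$$\min\left\{ \frac{1}{1 + e^{(1+2b)\beta}}, \frac{1}{1 + e^{-(1+2b)\beta}} \right\} = \frac{1}{1 + e^{\beta}};$$

2. both chains update to 1 (and thus $H(X_{t+1}, Y_{t+1}) = 2\ell - 1$) with probability

$$\min\left\{ \frac{1}{1 + e^{(1+2b)\beta}}, \frac{1}{1 + e^{-(1+2b)\beta}} \right\} = \frac{1}{1 + e^{\beta}};$$

3. chains $X$ and $Y$ choose two different strategies for updating the strategy of player $i$ (and thus
$H(X_{t+1}, Y_{t+1}) = 2\ell$) with probability $1 - \frac{2}{1 + e^{\beta}}$.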

The following lemma summarizes the above observations. 
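Lemma 25 Suppose that $H(X_t, Y_t) = 2\ell$, for $\ell > 0$. Then

$$H(X_{t+1}, Y_{t+1}) = \begin{cases} 2\ell - 1, & \text{with probability } \frac{2\ell}{n} \cdot \frac{2}{1 + e^{\beta}}; \\ 2\ell, & \text{with probability } 1 - \frac{2\ell}{n} \cdot \frac{2}{1 + e^{\beta}}. \end{cases}$$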



7.2 Odd Hamming distance 

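Let $X_t$ and $Y_t$ be two chains coupled as described in Section 3.2. Suppose that $X_t = \mathbf{x}$, $Y_t = \mathbf{y}$, and
$H(\mathbf{x}, \mathbf{y}) = 2\ell - 1$, for $\ell > 0$. In this case we have $u_i(\mathbf{x}) = b$ and $u_i(\mathbf{y}) = -(1 + b)$ for some $b \in \{-1, 0\}$.
Let $i$ be the index selected for update and let us distinguish two cases.
In the case in which $x_i = y_i = c$ for some $c \in \{0, 1\}$, we have

$$u_i(\mathbf{x}_{-i}, c) = u_i(\mathbf{y}_{-i}, 1 - c) = b \quad \text{and} \quad u_i(\mathbf{x}_{-i}, 1 - c) = u_i(\mathbf{y}_{-i}, c) = -(1 + b).$$

Therefore

$$\sigma_i(c \mid \mathbf{x}) = \sigma_i(1 - c \mid \mathbf{y}) = \frac{1}{1 + e^{-(1+2b)\beta}} \quad \text{and} \quad \sigma_i(1 - c \mid \mathbf{x}) = \sigma_i(c \mid \mathbf{y}) = \frac{1}{1 + e^{(1+2b)\beta}}$$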

and thus we have three possible updates:

1. both chains update to $c$ (and thus $H(X_{t+1}, Y_{t+1}) = 2\ell - 1$) with probability

$$\min\left\{ \frac{1}{1 + e^{-(1+2b)\beta}}, \frac{1}{1 + e^{(1+2b)\beta}} \right\} = \frac{1}{1 + e^{\beta}};$$

2. both chains update to $1 - c$ (and thus $H(X_{t+1}, Y_{t+1}) = 2\ell - 1$) with probability

$$\min\left\{ \frac{1}{1 + e^{-(1+2b)\beta}}, \frac{1}{1 + e^{(1+2b)\beta}} \right\} = \frac{1}{1 + e^{\beta}};$$

3. chains $X$ and $Y$ choose two different strategies for updating the strategy of player $i$ (and thus
$H(X_{t+1}, Y_{t+1}) = 2\ell$) with probability $1 - \frac{2}{1 + e^{\beta}}$.

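In the second case we have $x_i \neq y_i$ and we assume, without loss of generality, that $x_i = 0$ and $y_i = 1$.
We observe that

$$u_i(\mathbf{x}_{-i}, 0) = u_i(\mathbf{y}_{-i}, 0) = b \quad \text{and} \quad u_i(\mathbf{x}_{-i}, 1) = u_i(\mathbf{y}_{-i}, 1) = -(1 + b).$$

Therefore we have

$$\sigma_i(0 \mid \mathbf{x}) = \sigma_i(0 \mid \mathbf{y}) \quad \text{and} \quad \sigma_i(1 \mid \mathbf{x}) = \sigma_i(1 \mid \mathbf{y})$$

and thus in this case $H(X_{t+1}, Y_{t+1}) = 2\ell - 2$.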

The following lemma summarizes the above observations. 

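Lemma 26 Suppose that $H(X_t, Y_t) = 2\ell - 1$, for $\ell > 0$. Then

$$H(X_{t+1}, Y_{t+1}) = \begin{cases} 2\ell - 2, & \text{with probability } \frac{2\ell - 1}{n}; \\ 2\ell - 1, & \text{with probability } \frac{n - 2\ell + 1}{n} \cdot \frac{2}{1 + e^{\beta}}; \\ 2\ell, & \text{with probability } \frac{n - 2\ell + 1}{n} \cdot \left(1 - \frac{2}{1 + e^{\beta}}\right). \end{cases}$$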
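The one-step distance laws for even and odd Hamming distance can be checked exactly on small instances. The sketch below is illustrative only: since the coupling of Section 3.2 is not restated here, it assumes a maximal coupling of the selected player's update (both chains select the same player, agree on strategy 0 with probability $\min\{\sigma_i(0 \mid \mathbf{x}), \sigma_i(0 \mid \mathbf{y})\}$, agree on strategy 1 with probability $\min\{\sigma_i(1 \mid \mathbf{x}), \sigma_i(1 \mid \mathbf{y})\}$, and differ otherwise), which matches the three cases enumerated above. Function names are hypothetical.

```python
import math


def utility(profile):
    """XOR-game utility: -1 for an odd number of 1's, 0 otherwise."""
    return -1 if sum(profile) % 2 == 1 else 0


def prob_zero(profile, i, beta):
    """Probability that player i, once selected, plays 0 in the given profile."""
    w0 = math.exp(beta * utility(profile[:i] + (0,) + profile[i + 1:]))
    w1 = math.exp(beta * utility(profile[:i] + (1,) + profile[i + 1:]))
    return w0 / (w0 + w1)


def next_distance_law(x, y, beta):
    """Exact law of H(X_{t+1}, Y_{t+1}) given X_t = x, Y_t = y (tuples of 0/1)."""
    n = len(x)
    distance = sum(a != b for a, b in zip(x, y))
    law = {}
    for i in range(n):  # each player is selected with probability 1/n
        px, py = prob_zero(x, i, beta), prob_zero(y, i, beta)
        differ = abs(px - py)                # chains pick different strategies
        rest = distance - (x[i] != y[i])     # distance on the other coordinates
        for new_distance, prob in ((rest, 1.0 - differ), (rest + 1, differ)):
            if prob > 0.0:
                law[new_distance] = law.get(new_distance, 0.0) + prob / n
    return law
```

For example, with `x = (0, 0, 0, 0, 0)` and `y = (1, 0, 0, 0, 0)` (odd distance, $\ell = 1$, $n = 5$), the returned dictionary should put mass $1/5$ on distance 0, in agreement with Lemma 26.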

7.3 Time to coalesce 

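We denote by $\tau_k$ the random variable indicating the first time at which the two coupled chains have
distance $k$. More precisely,

$$\tau_k = \min\{t : H(X_t, Y_t) = k\}.$$

Therefore, $\tau_{\mathrm{couple}} = \tau_0$ is the time needed for the two chains to coalesce. We next give a bound on the
expected time $\mathbb{E}_{\mathbf{x},\mathbf{y}}[\tau_{\mathrm{couple}}]$ for the two chains to coalesce starting from $\mathbf{x}$ and $\mathbf{y}$. If $\mathbf{x}$ and $\mathbf{y}$ have distance
$2\ell$, we denote by $\mu_\ell$ the expected time to reach distance $2\ell - 2$. That is,

$$\mu_\ell = \mathbb{E}_{\mathbf{x},\mathbf{y}}[\tau_{2\ell - 2}].$$

Similarly, if $\mathbf{x}$ and $\mathbf{y}$ have distance $2\ell - 1$, we denote by $\nu_\ell$ the expected time to reach distance $2\ell - 2$.
That is,

$$\nu_\ell = \mathbb{E}_{\mathbf{x},\mathbf{y}}[\tau_{2\ell - 2}].$$

Notice that, if $H(\mathbf{x}, \mathbf{y}) = H(\mathbf{x}', \mathbf{y}')$ then

$$\mathbb{E}_{\mathbf{x},\mathbf{y}}[\tau_k] = \mathbb{E}_{\mathbf{x}',\mathbf{y}'}[\tau_k]$$

for all $k$, and thus the $\mu_\ell$ and $\nu_\ell$ are well defined.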

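From Lemma 25 and Lemma 26 we have the following relations

$$\mu_\ell = 1 + \mu_\ell \cdot \left(1 - \frac{2\ell}{n} \cdot \frac{2}{1 + e^{\beta}}\right) + \nu_\ell \cdot \frac{2\ell}{n} \cdot \frac{2}{1 + e^{\beta}},$$

$$\nu_\ell = 1 + \nu_\ell \cdot \frac{n - 2\ell + 1}{n} \cdot \frac{2}{1 + e^{\beta}} + \mu_\ell \cdot \frac{n - 2\ell + 1}{n} \cdot \left(1 - \frac{2}{1 + e^{\beta}}\right).$$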
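Simple algebraic manipulations give

$$\nu_\ell = \frac{n}{2\ell - 1} \left(1 + \frac{n - 2\ell + 1}{2\ell} \cdot \frac{e^{\beta} - 1}{2}\right)$$

and

$$\mu_\ell = \frac{n}{2\ell} \cdot \frac{1 + e^{\beta}}{2} + \nu_\ell = \frac{n}{2\ell} \cdot \frac{1 + e^{\beta}}{2} + \frac{n}{2\ell - 1} \left(1 + \frac{n - 2\ell + 1}{2\ell} \cdot \frac{e^{\beta} - 1}{2}\right),$$

so that both $\mu_\ell$ and $\nu_\ell$ are $O(n^2 e^{\beta})$ for every $\ell \geq 1$.

Hence, since the initial distance is at most $n$ and the distance drops from $2\ell$ (respectively $2\ell - 1$) to
$2\ell - 2$ in expected time $\mu_\ell$ (respectively $\nu_\ell$), summing over the at most $\lceil n/2 \rceil$ values of $\ell$ gives

$$\mathbb{E}_{\mathbf{x},\mathbf{y}}[\tau_{\mathrm{couple}}] = O(n^3 e^{\beta}).$$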



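From Markov's inequality we have that

$$\mathbf{P}_{\mathbf{x},\mathbf{y}}(\tau_{\mathrm{couple}} > t) \leq \frac{\mathbb{E}_{\mathbf{x},\mathbf{y}}[\tau_{\mathrm{couple}}]}{t},$$

and thus, by taking $t_0 = 4\,\mathbb{E}_{\mathbf{x},\mathbf{y}}[\tau_{\mathrm{couple}}]$, we have $d(t_0) \leq 1/4$. Therefore, by using Theorem 1 we have
that

$$t_{\mathrm{mix}} = O(n^3 e^{\beta}).$$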
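The two linear relations for $\mu_\ell$ and $\nu_\ell$ that follow from Lemma 25 and Lemma 26 can be solved numerically and compared with a closed form. The sketch below is a check under stated assumptions, not the authors' derivation: the closed form used is the one obtained by solving the two relations by hand, namely $\nu_\ell = \frac{n}{2\ell-1}\left(1 + \frac{n-2\ell+1}{2\ell} \cdot \frac{e^{\beta}-1}{2}\right)$ and $\mu_\ell = \frac{n}{2\ell} \cdot \frac{1+e^{\beta}}{2} + \nu_\ell$, and the function names are illustrative.

```python
import math


def solve_relations(n, ell, beta):
    """Solve the 2x2 linear system for (mu_ell, nu_ell) by Cramer's rule."""
    p = 2.0 / (1.0 + math.exp(beta))   # 2 / (1 + e^beta)
    a = 2.0 * ell / n                  # differing player selected at distance 2*ell
    c = (n - 2.0 * ell + 1.0) / n      # agreeing player selected at distance 2*ell - 1
    # mu = 1 + mu*(1 - a*p) + nu*a*p      ->   a*p*mu - a*p*nu = 1
    # nu = 1 + nu*c*p + mu*c*(1 - p)      ->  -c*(1-p)*mu + (1 - c*p)*nu = 1
    a11, a12 = a * p, -a * p
    a21, a22 = -c * (1.0 - p), 1.0 - c * p
    det = a11 * a22 - a12 * a21
    mu = (a22 - a12) / det             # right-hand side is (1, 1)
    nu = (a11 - a21) / det
    return mu, nu


def closed_forms(n, ell, beta):
    """Hand-derived closed forms for (mu_ell, nu_ell); see the lead-in."""
    half = (math.exp(beta) - 1.0) / 2.0
    nu = n / (2.0 * ell - 1.0) * (1.0 + (n - 2.0 * ell + 1.0) / (2.0 * ell) * half)
    mu = n / (2.0 * ell) * (1.0 + math.exp(beta)) / 2.0 + nu
    return mu, nu
```

For $n = 2$, $\ell = 1$ and $\beta = 0$ both functions give $\mu_1 = 3$ and $\nu_1 = 2$. Since each $\mu_\ell$ is at most of order $n^2 e^{\beta}$ and there are at most $\lceil n/2 \rceil$ values of $\ell$, the numbers are consistent with the $O(n^3 e^{\beta})$ bound on the expected coalescence time.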



8 Conclusions and open problems 

In this paper we studied strategic games where at every run a player is selected uniformly at random
and she is assumed to choose her strategy for the next run according to the logit dynamics: a noisy
best-response dynamics where the noise level is tuned by a parameter $\beta$. Such dynamics defines a family
of ergodic Markov chains, indexed by $\beta$, over the set of strategy profiles.

We proposed the stationary distribution of these Markov chains as a solution concept for games where
players have bounded rationality or limited knowledge about the system. Since this solution concept
does not assume full rationality of agents, it avoids one of the main drawbacks of many classical equilibrium
concepts. Moreover, the stationary distribution of an ergodic Markov chain always exists, it is unique,
and the chain converges to such a distribution from any starting state.

In order to evaluate the long-term performance of the system, on the one hand we analyzed the 
expected social welfare when the strategy profiles are random according to the stationary distribution, 
on the other hand we studied the mixing time, i.e. how long it takes, for a chain starting at an arbitrary 
profile, to get close to its stationary distribution. 

In this paper we applied this approach to some simple but well-studied games with a constant number
of players: the CK game, which attains the worst Price of Anarchy bound among linear congestion games,
and the 2x2 coordination games considered in the seminal paper about logit dynamics [5]. We also
considered two simple $n$-player games, the OR game and the XOR game: the analysis of the mixing
time turned out to be far from trivial even for such simple games. The above games highlight a twofold
behavior: for some games, namely the CK game and the OR game, the mixing time can be upper bounded by a
function independent of $\beta$, whereas the mixing time for the other games depends exponentially on the
noise parameter $\beta$.
The main goal of our line of research is to investigate logit dynamics for notable classes of n-player
games. It would also be interesting to consider variations of the logit dynamics where players update 
their strategies simultaneously or where the noise is not uniform between players. 

We have seen that, for some games and for some values of $\beta$, the mixing time can be exponential in
the number of players. When it takes such a long time to reach the stationary distribution, it would be 
interesting to investigate the evolution of the system in the transient phase of the logit dynamics. 